Novel pesticidal toxins and nucleotide sequences which encode these toxins

ABSTRACT

Disclosed and claimed are novel Bacillus thuringiensis isolates, pesticidal toxins, genes, and nucleotide probes and primers for the identification of genes encoding toxins active against pests. The primers are useful in PCR techniques to produce gene fragments which are characteristic of genes encoding these toxins. The subject invention provides entirely new families of toxins from Bacillus isolates.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation of co-pending application Ser. No. 09/073,898, filed May 6, 1998; which is a continuation-in-part of Ser. No. 08/960,780, filed Oct. 30, 1997, now U.S. Pat. No. 6,204,435; which claims priority from provisional application Ser. No. 60/029,848, filed Oct. 30, 1996.

BACKGROUND OF THE INVENTION

[0002] The soil microbe Bacillus thuringiensis (B.t.) is a Gram-positive, spore-forming bacterium characterized by parasporal crystalline protein inclusions. These inclusions often appear microscopically as distinctively shaped crystals. The proteins can be highly toxic to pests and specific in their toxic activity. Certain B.t. toxin genes have been isolated and sequenced, and recombinant DNA-based B.t. products have been produced and approved for use. In addition, with the use of genetic engineering techniques, new approaches for delivering these B.t. endotoxins to agricultural environments are under development, including the use of plants genetically engineered with endotoxin genes for insect resistance and the use of stabilized intact microbial cells as B.t. endotoxin delivery vehicles (Gaertner, F. H., L. Kim [1988] TIBTECH 6:S4-S7). Thus, isolated B.t. endotoxin genes are becoming commercially valuable.

[0003] Until the last fifteen years, commercial use of B.t. pesticides has been largely restricted to a narrow range of lepidopteran (caterpillar) pests. Preparations of the spores and crystals of B. thuringiensis subsp. kurstaki have been used for many years as commercial insecticides for lepidopteran pests. For example, B. thuringiensis var. kurstaki HD-1 produces a crystalline δ-endotoxin which is toxic to the larvae of a number of lepidopteran insects.

[0004] In recent years, however, investigators have discovered B.t. pesticides with specificities for a much broader range of pests. For example, other subspecies of B.t., namely israelensis and morrisoni (a.k.a. tenebrionis, a.k.a. B.t. M-7, a.k.a. B.t. san diego), have been used commercially to control insects of the orders Diptera and Coleoptera, respectively (Gaertner, F. H. [1989] “Cellular Delivery Systems for Insecticidal Proteins: Living and Non-Living Microorganisms,” in Controlled Delivery of Crop Protection Agents, R. M. Wilkins, ed., Taylor and Francis, New York and London, 1990, pp. 245-255). See also Couch, T. L. (1980) “Mosquito Pathogenicity of Bacillus thuringiensis var. israelensis,” Developments in Industrial Microbiology 22:61-76; and Beegle, C. C. (1978) “Use of Entomogenous Bacteria in Agroecosystems,” Developments in Industrial Microbiology 20:97-104. Krieg, A., A. M. Huger, G. A. Langenbruch, W. Schnetter (1983) Z. ang. Ent. 96:500-508 describe Bacillus thuringiensis var. tenebrionis, which is reportedly active against two beetles in the order Coleoptera. These are the Colorado potato beetle, Leptinotarsa decemlineata, and Agelastica alni.

[0005] More recently, new subspecies of B.t. have been identified, and genes responsible for active δ-endotoxin proteins have been isolated (Höfte, H., H. R. Whiteley [1989] Microbiological Reviews 52(2):242-255). Höfte and Whiteley classified B.t. crystal protein genes into four major classes. The classes were CryI (Lepidoptera-specific), CryII (Lepidoptera- and Diptera-specific), CryIII (Coleoptera-specific), and CryIV (Diptera-specific). The discovery of strains specifically toxic to other pests has been reported (Feitelson, J. S., J. Payne, L. Kim [1992] Bio/Technology 10:271-275). CryV has been proposed to designate a class of toxin genes that are nematode-specific. Lambert et al. (Lambert, B., L. Buysse, C. Decock, S. Jansens, C. Piens, B. Saey, J. Seurinck, K. van Audenhove, J. Van Rie, A. Van Vliet, M. Peferoen [1996] Appl. Environ. Microbiol. 62(1):80-86) describe the characterization of a Cry9 toxin active against lepidopterans. Published PCT applications WO 94/05771 and WO 94/24264 also describe B.t. isolates active against lepidopteran pests. Gleave et al. ([1991] JGM 138:55-62), Shevelev et al. ([1993] FEBS Lett. 336:79-82), and Smulevitch et al. ([1991] FEBS Lett. 293:25-26) also describe B.t. toxins. Many other classes of B.t. genes have now been identified.

[0006] The cloning and expression of a B.t. crystal protein gene in Escherichia coli has been described in the published literature (Schnepf, H. E., H. R. Whiteley [1981] Proc. Natl. Acad. Sci. USA 78:2893-2897). U.S. Pat. Nos. 4,448,885 and 4,467,036 both disclose the expression of B.t. crystal protein in E. coli. U.S. Pat. Nos. 4,990,332; 5,039,523; 5,126,133; 5,164,180; and 5,169,629 are among those which disclose B.t. toxins having activity against lepidopterans. PCT application WO96/05314 discloses PS86W1, PS86V1, and other B.t. isolates active against lepidopteran pests. The PCT patent applications published as WO94/24264 and WO94/05771 describe B.t. isolates and toxins active against lepidopteran pests. B.t. proteins with activity against members of the family Noctuidae are described by Lambert et al., supra. U.S. Pat. Nos. 4,797,276 and 4,853,331 disclose B. thuringiensis strain tenebrionis which can be used to control coleopteran pests in various environments. U.S. Pat. No. 4,918,006 discloses B.t. toxins having activity against dipterans. U.S. Pat. Nos. 5,151,363 and 4,948,734 disclose certain isolates of B.t. which have activity against nematodes. Other U.S. patents which disclose activity against nematodes include U.S. Pat. Nos. 5,093,120; 5,236,843; 5,262,399; 5,270,448; 5,281,530; 5,322,932; 5,350,577; 5,426,049; 5,439,881; 5,667,993; and 5,670,365. As a result of extensive research and investment of resources, other patents have issued for new B.t. isolates and new uses of B.t. isolates. See Feitelson et al., supra, for a review. However, the discovery of new B.t. isolates and new uses of known B.t. isolates remains an empirical, unpredictable art.

[0007] Isolating responsible toxin genes has been a slow empirical process. Carozzi et al. (Carozzi, N. B., V. C. Kramer, G. W. Warren, S. Evola, G. Koziel [1991] Appl. Env. Microbiol. 57(11):3057-3061) describe methods for identifying toxin genes. U.S. Pat. No. 5,204,237 describes specific and universal probes for the isolation of B.t. toxin genes. That patent, however, does not describe the probes and primers of the subject invention.

[0008] WO 94/21795, WO 96/10083, and Estruch, J. J. et al. (1996) PNAS 93:5389-5394 describe toxins obtained from Bacillus microbes. These toxins are reported to be produced during vegetative cell growth and were thus termed vegetative insecticidal proteins (VIP). These toxins were reported to be distinct from crystal-forming δ-endotoxins. Activity of these toxins against lepidopteran and coleopteran pests was reported. These applications make specific reference to toxins designated Vip1A(a), Vip1A(b), Vip2A(a), Vip2A(b), Vip3A(a), and Vip3A(b). The toxins and genes of the current invention are distinct from those disclosed in the '795 and '083 applications and the Estruch article.

BRIEF SUMMARY OF THE INVENTION

[0009] The subject invention concerns materials and methods useful in the control of non-mammalian pests and, particularly, plant pests. In one embodiment, the subject invention provides novel B.t. isolates having advantageous activity against non-mammalian pests. In a further embodiment, the subject invention provides new toxins useful for the control of non-mammalian pests. In a preferred embodiment, these pests are lepidopterans and/or coleopterans. The toxins of the subject invention include δ-endotoxins as well as soluble toxins which can be obtained from the supernatant of Bacillus cultures.

[0010] The subject invention further provides nucleotide sequences which encode the toxins of the subject invention. The subject invention further provides nucleotide sequences and methods useful in the identification and characterization of genes which encode pesticidal toxins.

[0011] In one embodiment, the subject invention concerns unique nucleotide sequences which are useful as hybridization probes and/or primers in PCR techniques. The primers produce characteristic gene fragments which can be used in the identification, characterization, and/or isolation of specific toxin genes. The nucleotide sequences of the subject invention encode toxins which are distinct from previously-described toxins.

[0012] In a specific embodiment, the subject invention provides new classes of toxins having advantageous pesticidal activities. These classes of toxins can be encoded by polynucleotide sequences which are characterized by their ability to hybridize with certain exemplified sequences and/or by their ability to be amplified by PCR using certain exemplified primers.

[0013] One aspect of the subject invention pertains to the identification and characterization of entirely new families of Bacillus thuringiensis toxins having advantageous pesticidal properties. Specific new toxin families of the subject invention include MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, MIS-7, MIS-8, WAR-1, and SUP-1. These families of toxins, and the genes which encode them, can be characterized in terms of, for example, the size of the toxin or gene, the DNA or amino acid sequence, pesticidal activity, and/or antibody reactivity. With regard to the genes encoding the novel toxin families of the subject invention, the current disclosure provides unique hybridization probes and PCR primers which can be used to identify and characterize DNA within each of the exemplified families.

[0014] In one embodiment of the subject invention, Bacillus isolates can be cultivated under conditions resulting in high multiplication of the microbe. After treating the microbe to provide single-stranded genomic nucleic acid, the DNA can be contacted with the primers of the invention and subjected to PCR amplification. Characteristic fragments of toxin-encoding genes will be amplified by the procedure, thus identifying the presence of the toxin-encoding gene(s).
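The amplification step described above can be approximated in silico when a template sequence is already known. The following Python sketch is offered for illustration only and is not the procedure of the subject invention. It assumes exact, non-degenerate primer matches on a single template strand, which real PCR does not require, and the names amplicon_size and reverse_complement, as well as any sequences passed to them, are hypothetical.

```python
from typing import Optional

# Complement table for unambiguous DNA bases (assumption: no degenerate codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_size(template: str, forward: str, reverse: str) -> Optional[int]:
    """Predict the size of the fragment amplified from `template`.

    The forward primer must match the template strand exactly; the reverse
    primer must match the opposite strand, i.e. its reverse complement must
    occur on the template downstream of the forward primer.
    Returns the fragment length in nucleotides, or None if either site is
    missing.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start
```

A predicted size obtained this way could be compared with an observed fragment size, but a match in size alone would not confirm the identity of a gene.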

[0015] A further aspect of the subject invention is the use of the disclosed nucleotide sequences as probes to detect genes encoding Bacillus toxins which are active against pests.

[0016] Further aspects of the subject invention include the genes and isolates identified using the methods and nucleotide sequences disclosed herein. The genes thus identified encode toxins active against pests. Similarly, the isolates will have activity against these pests. In a preferred embodiment, these pests are lepidopteran or coleopteran pests.

[0017] In a preferred embodiment, the subject invention concerns plant cells transformed with at least one polynucleotide sequence of the subject invention such that the transformed plant cells express pesticidal toxins in tissues consumed by target pests. As described herein, the toxins useful according to the subject invention may be chimeric toxins produced by combining portions of multiple toxins. In addition, mixtures and/or combinations of toxins can be used according to the subject invention.

[0018] Transformation of plants with the genetic constructs disclosed herein can be accomplished using techniques well known to those skilled in the art and would typically involve modification of the gene to optimize expression of the toxin in plants.

[0019] Alternatively, the Bacillus isolates of the subject invention, or recombinant microbes expressing the toxins described herein, can be used to control pests. In this regard, the invention includes the treatment of substantially intact Bacillus cells, and/or recombinant cells containing the expressed toxins of the invention, treated to prolong the pesticidal activity when the substantially intact cells are applied to the environment of a target pest. The treated cell acts as a protective coating for the pesticidal toxin. The toxin becomes active upon ingestion by a target insect.

BRIEF DESCRIPTION OF THE SEQUENCES

[0020] SEQ ID NO. 1 is a forward primer, designated “the 339 forward primer,” used according to the subject invention.

[0021] SEQ ID NO. 2 is a reverse primer, designated “the 339 reverse primer,” used according to the subject invention.

[0022] SEQ ID NO. 3 is a nucleotide sequence encoding a toxin from B.t. strain PS36A.

[0023] SEQ ID NO. 4 is an amino acid sequence for the 36A toxin.

[0024] SEQ ID NO. 5 is a nucleotide sequence encoding a toxin from B.t. strain PS81F.

[0025] SEQ ID NO. 6 is an amino acid sequence for the 81F toxin.

[0026] SEQ ID NO. 7 is a nucleotide sequence encoding a toxin from B.t. strain Javelin 1990.

[0027] SEQ ID NO. 8 is an amino acid sequence for the Javelin 1990 toxin.

[0028] SEQ ID NO. 9 is a forward primer, designated “158C2 PRIMER A,” used according to the subject invention.

[0029] SEQ ID NO. 10 is a nucleotide sequence encoding a portion of a soluble toxin from B.t. PS158C2.

[0030] SEQ ID NO. 11 is a forward primer, designated “49C PRIMER A,” used according to the subject invention.

[0031] SEQ ID NO. 12 is a nucleotide sequence of a portion of a toxin gene from B.t. strain PS49C.

[0032] SEQ ID NO. 13 is a forward primer, designated “49C PRIMER B,” used according to the subject invention.

[0033] SEQ ID NO. 14 is a reverse primer, designated “49C PRIMER C,” used according to the subject invention.

[0034] SEQ ID NO. 15 is an additional nucleotide sequence of a portion of a toxin gene from PS49C.

[0035] SEQ ID NO. 16 is a forward primer used according to the subject invention.

[0036] SEQ ID NO. 17 is a reverse primer used according to the subject invention.

[0037] SEQ ID NO. 18 is a nucleotide sequence of a toxin gene from B.t. strain PS10E1.

[0038] SEQ ID NO. 19 is an amino acid sequence from the 10E1 toxin.

[0039] SEQ ID NO. 20 is a nucleotide sequence of a toxin gene from B.t. strain PS31J2.

[0040] SEQ ID NO. 21 is an amino acid sequence from the 31J2 toxin.

[0041] SEQ ID NO. 22 is a nucleotide sequence of a toxin gene from B.t. strain PS33D2.

[0042] SEQ ID NO. 23 is an amino acid sequence from the 33D2 toxin.

[0043] SEQ ID NO. 24 is a nucleotide sequence of a toxin gene from B.t. strain PS66D3.

[0044] SEQ ID NO. 25 is an amino acid sequence from the 66D3 toxin.

[0045] SEQ ID NO. 26 is a nucleotide sequence of a toxin gene from B.t. strain PS68F.

[0046] SEQ ID NO. 27 is an amino acid sequence from the 68F toxin.

[0047] SEQ ID NO. 28 is a nucleotide sequence of a toxin gene from B.t. strain PS69AA2.

[0048] SEQ ID NO. 29 is an amino acid sequence from the 69AA2 toxin.

[0049] SEQ ID NO. 30 is a nucleotide sequence of a toxin gene from B.t. strain PS168G1.

[0050] SEQ ID NO. 31 is a nucleotide sequence of a MIS toxin gene from B.t. strain PS177C8.

[0051] SEQ ID NO. 32 is an amino acid sequence from the 177C8-MIS toxin.

[0052] SEQ ID NO. 33 is a nucleotide sequence of a toxin gene from B.t. strain PS177I8.

[0053] SEQ ID NO. 34 is an amino acid sequence from the 177I8 toxin.

[0054] SEQ ID NO. 35 is a nucleotide sequence of a toxin gene from B.t. strain PS185AA2.

[0055] SEQ ID NO. 36 is an amino acid sequence from the 185AA2 toxin.

[0056] SEQ ID NO. 37 is a nucleotide sequence of a toxin gene from B.t. strain PS196F3.

[0057] SEQ ID NO. 38 is an amino acid sequence from the 196F3 toxin.

[0058] SEQ ID NO. 39 is a nucleotide sequence of a toxin gene from B.t. strain PS196J4.

[0059] SEQ ID NO. 40 is an amino acid sequence from the 196J4 toxin.

[0060] SEQ ID NO. 41 is a nucleotide sequence of a toxin gene from B.t. strain PS197T1.

[0061] SEQ ID NO. 42 is an amino acid sequence from the 197T1 toxin.

[0062] SEQ ID NO. 43 is a nucleotide sequence of a toxin gene from B.t. strain PS197U2.

[0063] SEQ ID NO. 44 is an amino acid sequence from the 197U2 toxin.

[0064] SEQ ID NO. 45 is a nucleotide sequence of a toxin gene from B.t. strain PS202E1.

[0065] SEQ ID NO. 46 is an amino acid sequence from the 202E1 toxin.

[0066] SEQ ID NO. 47 is a nucleotide sequence of a toxin gene from B.t. strain KB33.

[0067] SEQ ID NO. 48 is a nucleotide sequence of a toxin gene from B.t. strain KB38.

[0068] SEQ ID NO. 49 is a forward primer, designated “ICON-forward,” used according to the subject invention.

[0069] SEQ ID NO. 50 is a reverse primer, designated “ICON-reverse,” used according to the subject invention.

[0070] SEQ ID NO. 51 is a nucleotide sequence encoding a 177C8-WAR toxin gene from B.t. strain PS177C8.

[0071] SEQ ID NO. 52 is an amino acid sequence of a 177C8-WAR toxin from B.t. strain PS177C8.

[0072] SEQ ID NO. 53 is a forward primer, designated “SUP-1A,” used according to the subject invention.

[0073] SEQ ID NO. 54 is a reverse primer, designated “SUP-1B,” used according to the subject invention.

[0074] SEQ ID NOS. 55-110 are primers used according to the subject invention.

[0075] SEQ ID NO. 111 is the reverse complement of the primer of SEQ ID NO. 58.

[0076] SEQ ID NO. 112 is the reverse complement of the primer of SEQ ID NO. 60.

[0077] SEQ ID NO. 113 is the reverse complement of the primer of SEQ ID NO. 64.

[0078] SEQ ID NO. 114 is the reverse complement of the primer of SEQ ID NO. 66.

[0079] SEQ ID NO. 115 is the reverse complement of the primer of SEQ ID NO. 68.

[0080] SEQ ID NO. 116 is the reverse complement of the primer of SEQ ID NO. 70.

[0081] SEQ ID NO. 117 is the reverse complement of the primer of SEQ ID NO. 72.

[0082] SEQ ID NO. 118 is the reverse complement of the primer of SEQ ID NO. 76.

[0083] SEQ ID NO. 119 is the reverse complement of the primer of SEQ ID NO. 78.

[0084] SEQ ID NO. 120 is the reverse complement of the primer of SEQ ID NO. 80.

[0085] SEQ ID NO. 121 is the reverse complement of the primer of SEQ ID NO. 82.

[0086] SEQ ID NO. 122 is the reverse complement of the primer of SEQ ID NO. 84.

[0087] SEQ ID NO. 123 is the reverse complement of the primer of SEQ ID NO. 86.

[0088] SEQ ID NO. 124 is the reverse complement of the primer of SEQ ID NO. 88.

[0089] SEQ ID NO. 125 is the reverse complement of the primer of SEQ ID NO. 92.

[0090] SEQ ID NO. 126 is the reverse complement of the primer of SEQ ID NO. 94.

[0091] SEQ ID NO. 127 is the reverse complement of the primer of SEQ ID NO. 96.

[0092] SEQ ID NO. 128 is the reverse complement of the primer of SEQ ID NO. 98.

[0093] SEQ ID NO. 129 is the reverse complement of the primer of SEQ ID NO. 99.

[0094] SEQ ID NO. 130 is the reverse complement of the primer of SEQ ID NO. 100.

[0095] SEQ ID NO. 131 is the reverse complement of the primer of SEQ ID NO. 104.

[0096] SEQ ID NO. 132 is the reverse complement of the primer of SEQ ID NO. 106.

[0097] SEQ ID NO. 133 is the reverse complement of the primer of SEQ ID NO. 108.

[0098] SEQ ID NO. 134 is the reverse complement of the primer of SEQ ID NO. 110.
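The relationship between a primer and its reverse complement, as recited for SEQ ID NOS. 111-134, can be illustrated with a short Python sketch. It is offered as an illustration only: the function name reverse_complement is hypothetical, no sequence from the sequence listing is reproduced, and support for IUPAC ambiguity codes is included on the assumption that some primers may be degenerate.

```python
# Complement table covering the standard bases and the IUPAC ambiguity codes.
# For example, R (A or G) pairs with Y (C or T), and K (G or T) with M (A or C).
_IUPAC_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVNacgtryswkmbdhvn",
    "TGCAYRSWMKVHDBNtgcayrswmkvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Each base is replaced by its complement, and the result is reversed so
    that it reads 5' to 3' on the opposite strand. Letter case is preserved.
    """
    return seq.translate(_IUPAC_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which provides a simple consistency check on any primer and reverse-complement pair.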

[0099] SEQ ID NO. 135 is a MIS-7 forward primer.

[0100] SEQ ID NO. 136 is a MIS-7 reverse primer.

[0101] SEQ ID NO. 137 is a MIS-8 forward primer.

[0102] SEQ ID NO. 138 is a MIS-8 reverse primer.

[0103] SEQ ID NO. 139 is a nucleotide sequence of a MIS-7 toxin gene designated 157C1-A from B.t. strain PS157C1.

[0104] SEQ ID NO. 140 is an amino acid sequence of a MIS-7 toxin designated 157C1-A from B.t. strain PS157C1.

[0105] SEQ ID NO. 141 is a nucleotide sequence of a MIS-7 toxin gene from B.t. strain PS201Z.

[0106] SEQ ID NO. 142 is a nucleotide sequence of a MIS-8 toxin gene from B.t. strain PS31F2.

[0107] SEQ ID NO. 143 is a nucleotide sequence of a MIS-8 toxin gene from B.t. strain PS185Y2.

[0108] SEQ ID NO. 144 is a nucleotide sequence of a MIS-1 toxin gene from B.t. strain PS33F1.

DETAILED DISCLOSURE OF THE INVENTION

[0109] The subject invention concerns materials and methods for the control of non-mammalian pests. In specific embodiments, the subject invention pertains to new Bacillus thuringiensis isolates and toxins which have activity against lepidopterans and/or coleopterans. The subject invention further concerns novel genes which encode pesticidal toxins and novel methods for identifying and characterizing Bacillus genes which encode toxins with useful properties. The subject invention concerns not only the polynucleotide sequences which encode these toxins, but also the use of these polynucleotide sequences to produce recombinant hosts which express the toxins. The proteins of the subject invention are distinct from protein toxins which have previously been isolated from Bacillus thuringiensis.

[0110] B.t. isolates useful according to the subject invention have been deposited in the permanent collection of the Agricultural Research Service Patent Culture Collection (NRRL), Northern Regional Research Center, 1815 North University Street, Peoria, Ill. 61604, USA. The culture repository numbers of the B.t. strains are as follows:

TABLE 1

Culture               Repository No.  Deposit Date        Patent No.
B.t. PS11B (MT274)    NRRL B-21556    April 18, 1996
B.t. PS24J            NRRL B-18881    August 30, 1991
B.t. PS31G1 (MT278)   NRRL B-21560    April 18, 1996
B.t. PS36A            NRRL B-18929    December 27, 1991
B.t. PS33F2           NRRL B-18244    July 28, 1987       4,861,595
B.t. PS40D1           NRRL B-18300    February 3, 1988    5,098,705
B.t. PS43F            NRRL B-18298    February 2, 1988    4,996,155
B.t. PS45B1           NRRL B-18396    August 16, 1988     5,427,786
B.t. PS49C            NRRL B-21532    March 14, 1996
B.t. PS52A1           NRRL B-18245    July 28, 1987       4,861,595
B.t. PS62B1           NRRL B-18398    August 16, 1988     4,849,217
B.t. PS81A2           NRRL B-18484    April 19, 1989      5,164,180
B.t. PS81F            NRRL B-18424    October 7, 1988     5,045,469
B.t. PS81GG           NRRL B-18425    October 11, 1988    5,169,629
B.t. PS81I            NRRL B-18484    April 19, 1989      5,126,133
B.t. PS85A1           NRRL B-18426    October 11, 1988
B.t. PS86A1           NRRL B-18400    August 16, 1988     4,849,217
B.t. PS86B1           NRRL B-18299    February 2, 1988    4,966,765
B.t. PS86BB1 (MT275)  NRRL B-21557    April 18, 1996
B.t. PS86Q3           NRRL B-18765    February 6, 1991    5,208,017
B.t. PS86V1 (MT276)   NRRL B-21558    April 18, 1996
B.t. PS86W1 (MT277)   NRRL B-21559    April 18, 1996
B.t. PS89J3 (MT279)   NRRL B-21561    April 18, 1996
B.t. PS91C2           NRRL B-18931    February 6, 1991
B.t. PS92B            NRRL B-18889    September 23, 1991  5,427,786
B.t. PS101Z2          NRRL B-18890    October 1, 1991     5,427,786
B.t. PS122D3          NRRL B-18376    June 9, 1988        5,006,336
B.t. PS123D1          NRRL B-21011    October 13, 1992    5,508,032
B.t. PS157C1 (MT104)  NRRL B-18240    July 17, 1987       5,262,159
B.t. PS158C2          NRRL B-18872    August 27, 1991     5,268,172
B.t. PS169E           NRRL B-18682    July 17, 1990       5,151,363
B.t. PS177F1          NRRL B-18683    July 17, 1990       5,151,363
B.t. PS177G           NRRL B-18684    July 17, 1990       5,151,363
B.t. PS185L2          NRRL B-21535    March 14, 1996
B.t. PS185U2 (MT280)  NRRL B-21562    April 18, 1996
B.t. PS192M4          NRRL B-18932    December 27, 1991   5,273,746
B.t. PS201L1          NRRL B-18749    January 9, 1991     5,298,245
B.t. PS204C3          NRRL B-21008    October 6, 1992
B.t. PS204G4          NRRL B-18685    July 17, 1990       5,262,399
B.t. PS242H10         NRRL B-21439    March 14, 1996
B.t. PS242K17         NRRL B-21540    March 14, 1996
B.t. PS244A2          NRRL B-21541    March 14, 1996
B.t. PS244D1          NRRL B-21542    March 14, 1996
B.t. PS10E1           NRRL B-21862    October 24, 1997
B.t. PS31F2           NRRL B-21876    October 24, 1997
B.t. PS31J2           NRRL B-21009    October 13, 1992
B.t. PS33D2           NRRL B-21870    October 24, 1997
B.t. PS66D3           NRRL B-21858    October 24, 1997
B.t. PS68F            NRRL B-21857    October 24, 1997
B.t. PS69AA2          NRRL B-21859    October 24, 1997
B.t. PS146D           NRRL B-21866    October 24, 1997
B.t. PS168G1          NRRL B-21873    October 24, 1997
B.t. PS17514          NRRL B-21865    October 24, 1997
B.t. PS177C8a         NRRL B-21867    October 24, 1997
B.t. PS177I8          NRRL B-21868    October 24, 1997
B.t. PS185AA2         NRRL B-21861    October 24, 1997
B.t. PS196J4          NRRL B-21860    October 24, 1997
B.t. PS196F3          NRRL B-21872    October 24, 1997
B.t. PS197T1          NRRL B-21869    October 24, 1997
B.t. PS197U2          NRRL B-21871    October 24, 1997
B.t. PS202E1          NRRL B-21874    October 24, 1997
B.t. PS217U2          NRRL B-21864    October 24, 1997
KB33                  NRRL B-21875    October 24, 1997
KB38                  NRRL B-21863    October 24, 1997
KB53A49-4             NRRL B-21879    October 24, 1997
KB68B46-2             NRRL B-21877    October 24, 1997
KB68B51-2             NRRL B-21880    October 24, 1997
KB68B55-2             NRRL B-21878    October 24, 1997
PS80JJ1               NRRL B-18679    July 17, 1990       5,151,363
PS94R1                NRRL B-21801    July 1, 1997
PS101DD               NRRL B-21802    July 1, 1997
PS202S                NRRL B-21803    July 1, 1997
PS213E5               NRRL B-21804    July 1, 1997
PS218G2               NRRL B-21805    July 1, 1997
PS33F1                NRRL B-21977    April 24, 1998
PS71G4                NRRL B-21978    April 24, 1998
PS86D1                NRRL B-21979    April 24, 1998
PS185V2               NRRL B-21980    April 24, 1998
PS191A21              NRRL B-21981    April 24, 1998
PS201Z                NRRL B-21982    April 24, 1998
PS205A3               NRRL B-21983    April 24, 1998
PS205C                NRRL B-21984    April 24, 1998
PS234E1               NRRL B-21985    April 24, 1998
PS248N10              NRRL B-21986    April 24, 1998
KB63B19-13            NRRL B-21990    April 29, 1998
KB63B19-7             NRRL B-21989    April 29, 1998
KB68B62-7             NRRL B-21991    April 29, 1998
KB68B63-2             NRRL B-21992    April 29, 1998
KB69A125-1            NRRL B-21993    April 29, 1998
KB69A125-3            NRRL B-21994    April 29, 1998
KB69A125-5            NRRL B-21995    April 29, 1998
KB69A127-7            NRRL B-21996    April 29, 1998
KB69A132-1            NRRL B-21997    April 29, 1998
KB69B2-1              NRRL B-21998    April 29, 1998
KB70B5-3              NRRL B-21999    April 29, 1998
KB71A125-15           NRRL B-30001    April 29, 1998
KB71A35-6             NRRL B-30000    April 29, 1998
KB71A72-1             NRRL B-21987    April 29, 1998
KB71A134-2            NRRL B-21988    April 29, 1998

[0111] Cultures which have been deposited for the purposes of this patent application were deposited under conditions that assure that access to the cultures is available during the pendency of this patent application to one determined by the Commissioner of Patents and Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C. 122. The deposits will be available as required by foreign patent laws in countries wherein counterparts of the subject application, or its progeny, are filed. However, it should be understood that the availability of a deposit does not constitute a license to practice the subject invention in derogation of patent rights granted by governmental action.

[0112] Further, the subject culture deposits will be stored and made available to the public in accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., they will be stored with all the care necessary to keep them viable and uncontaminated for a period of at least five years after the most recent request for the furnishing of a sample of the deposit, and in any case, for a period of at least thirty (30) years after the date of deposit or for the enforceable life of any patent which may issue disclosing the culture(s). The depositor acknowledges the duty to replace the deposit(s) should the depository be unable to furnish a sample when requested, due to the condition of a deposit. All restrictions on the availability to the public of the subject culture deposits will be irrevocably removed upon the granting of a patent disclosing them.

[0113] Many of the strains useful according to the subject invention are readily available by virtue of the issuance of patents disclosing these strains or by their deposit in public collections or by their inclusion in commercial products. For example, the B.t. strain used in the commercial product, Javelin, and the HD isolates are all publicly available.

[0114] Mutants of the isolates referred to herein can be made by procedures well known in the art. For example, an asporogenous mutant can be obtained through ethylmethane sulfonate (EMS) mutagenesis of an isolate. Mutants can also be made using ultraviolet light and nitrosoguanidine by procedures well known in the art.

[0115] In one embodiment, the subject invention concerns materials and methods including nucleotide primers and probes for isolating, characterizing, and identifying Bacillus genes encoding protein toxins which are active against non-mammalian pests. The nucleotide sequences described herein can also be used to identify new pesticidal Bacillus isolates. The invention further concerns the genes, isolates, and toxins identified using the methods and materials disclosed herein.

[0116] The new toxins and polynucleotide sequences provided here are defined according to several parameters. One characteristic of the toxins described herein is pesticidal activity. In a specific embodiment, these toxins have activity against coleopteran and/or lepidopteran pests. The toxins and genes of the subject invention can be further defined by their amino acid and nucleotide sequences. The sequences of the molecules can be defined in terms of homology to certain exemplified sequences as well as in terms of the ability to hybridize with, or be amplified by, certain exemplified probes and primers. The toxins provided herein can also be identified based on their immunoreactivity with certain antibodies.

[0117] An important aspect of the subject invention is the identification and characterization of new families of Bacillus toxins, and genes which encode these toxins. These families have been designated MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, MIS-7, MIS-8, WAR-1, and SUP-1. Toxins within these families, as well as genes encoding toxins within these families, can readily be identified as described herein by, for example, size, amino acid or DNA sequence, and antibody reactivity. Amino acid and DNA sequence characteristics include homology with exemplified sequences, ability to hybridize with DNA probes, and ability to be amplified with specific primers.
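Homology with an exemplified sequence is commonly expressed as percent identity, and this disclosure elsewhere refers to sequence identity in excess of about 80% with the probes of Table 2. The following Python sketch shows one simple way such a figure might be computed. It is an illustration under stated assumptions rather than a method of the subject invention: the two sequences are assumed to have been aligned beforehand to equal length, gaps are not treated specially, and the names percent_identity and exceeds_threshold are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two pre-aligned sequences agree.

    Assumption: the sequences are already aligned and of equal length.
    Comparison is case-insensitive.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def exceeds_threshold(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True if identity is above the threshold (default: about 80%)."""
    return percent_identity(seq_a, seq_b) > threshold
```

In practice an alignment tool would be applied first, and the reported identity depends on the alignment parameters chosen.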

[0118] The MIS-1 family of toxins includes toxins from isolates PS68Fand PS33F1. Also provided are hybridization probes and PCR primers whichspecifically identify genes falling in the MIS-1 family.

[0119] A second family of toxins identified herein is the MIS-2 family.This family includes toxins which can be obtained from isolates PS66D3,PS197T1, and PS31J2. The subject invention further provides probes andprimers for the identification of MIS-2 toxins and genes.

[0120] A third family of toxins identified herein is the MIS-3 family.This family includes toxins which can be obtained from B.t. isolatesPS69AA2 and PS33D2. The subject invention further provides probes andprimers for identification of the MIS-3 genes and toxins.

[0121] Polynucleotide sequences encoding MIS-4 toxins can be obtainedfrom the B.t. isolate designated PS197U2. The subject invention furtherprovides probes and primers for the identification of genes and toxinsin this family.

[0122] A fifth family of toxins identified herein is the MIS-5 family.This family includes toxins which can be obtained from B.t. isolatesKB33 and KB38. The subject invention further provides probes and primersfor identification of the MIS-5 genes and toxins.

[0123] A sixth family of toxins identified herein is the MIS-6 family.This family includes toxins which can be obtained from B.t. isolatesPS196F3, PS168G1, PS196J4, PS202E1, PS10E1, and PS185AA2. The subjectinvention further provides probes and primers for identification of theMIS-6 genes and toxins.

[0124] A seventh family of toxins identified herein is the MIS-7 family.This family includes toxins which can be obtained from B.t. isolatesPS157C1, PS205C, and PS201Z. The subject invention further providesprobes and primers for identification of the MIS-7 genes and toxins.

[0125] An eighth family of toxins identified herein is the MIS-8 family.This family includes toxins which can be obtained from B.t. isolatesPS31F2 and PS185Y2. The subject invention further provides probes andprimers for identification of the MIS-8 genes and toxins.

[0126] In a preferred embodiment, the genes of the MIS family encodetoxins having a molecular weight of about 70 to about 100 kDa and, mostpreferably, the toxins have a size of about 80 kDa. Typically, thesetoxins are soluble and can be obtained from the supernatant of Bacilluscultures as described herein. These toxins have toxicity againstnon-mammalian pests. In a preferred embodiment, these toxins haveactivity against coleopteran pests. The MIS proteins are further usefuldue to their ability to form pores in cells. These proteins can be usedwith second entities including, for example, other proteins. When usedwith a second entity, the MIS protein will facilitate entry of thesecond agent into a target cell. In a preferred embodiment, the MISprotein interacts with MIS receptors in a target cell and causes poreformation in the target cell. The second entity may be a toxin oranother molecule whose entry into the cell is desired.

[0127] The subject invention further concerns a family of toxins designated WAR-1. The WAR-1 toxins typically have a size of about 30-50 kDa and, most typically, have a size of about 40 kDa. Typically, these toxins are soluble and can be obtained from the supernatant of Bacillus cultures as described herein. The WAR-1 toxins can be identified with primers described herein as well as with antibodies. In a specific embodiment, the antibodies can be raised to, for example, toxin from isolate PS177C8.

[0128] An additional family of toxins provided according to the subject invention is the family of toxins designated SUP-1. Typically, these toxins are soluble and can be obtained from the supernatant of Bacillus cultures as described herein. In a preferred embodiment, the SUP-1 toxins are active against lepidopteran pests. The SUP-1 toxins typically have a size of about 70-100 kDa and, preferably, about 80 kDa. The SUP-1 family is exemplified herein by toxins from isolates PS49C and PS158C2. The subject invention provides probes and primers useful for the identification of toxins and genes in the SUP-1 family.

[0129] The subject invention further provides specific Bacillus toxins and genes which did not fall into any of the new families disclosed herein. These specific toxins and genes include toxins and genes which can be obtained from PS177C8 and PS177I8.

[0130] Toxins in the MIS, WAR, and SUP families are all soluble and can be obtained as described herein from the supernatant of Bacillus cultures. These toxins can be used alone or in combination with other toxins to control pests. For example, toxins from the MIS families may be used in conjunction with WAR-type toxins to achieve control of pests, particularly coleopteran pests. These toxins may be used, for example, with δ-endotoxins which are obtained from Bacillus isolates.

[0131] Table 2 provides a summary of the novel families of toxins and genes of the subject invention. Each of the eight MIS families is specifically exemplified herein by toxins which can be obtained from particular B.t. isolates as shown in Table 2. Genes encoding toxins in each of these families can be identified by a variety of highly specific parameters, including the ability to hybridize with the particular probes set forth in Table 2. Sequence identity in excess of about 80% with the probes set forth in Table 2 can also be used to identify the genes of the various families. Also exemplified are particular primer pairs which can be used to amplify the genes of the subject invention. A portion of a gene within the indicated families would typically be amplifiable with at least one of the enumerated primer pairs. In a preferred embodiment, the amplified portion would be of approximately the indicated fragment size. Primers shown in Table 2 consist of polynucleotide sequences which encode peptides as shown in the sequence listing attached hereto. Additional primers and probes can readily be constructed by those skilled in the art such that alternate polynucleotide sequences encoding the same amino acid sequences can be used to identify and/or characterize additional genes encoding pesticidal toxins. In a preferred embodiment, these additional toxins, and their genes, could be obtained from Bacillus isolates.

TABLE 2

Family  Isolates                     Probes           Primer Pairs    Fragment size
                                     (SEQ ID NO.)     (SEQ ID NOS.)   (nt)
MIS-1   PS68F, PS33F1                26, 144          56 and 111      69
                                                      56 and 112      506
                                                      58 and 112      458
MIS-2   PS66D3, PS197T1, PS31J2      24, 41, 20       62 and 113      160
                                                      62 and 114      239
                                                      62 and 115      400
                                                      62 and 116      509
                                                      62 and 117      703
                                                      64 and 114      102
                                                      64 and 115      263
                                                      64 and 116      372
                                                      64 and 117      566
                                                      66 and 115      191
                                                      66 and 116      300
                                                      66 and 117      494
                                                      68 and 116      131
                                                      68 and 117      325
                                                      70 and 117      213
MIS-3   PS69AA2, PS33D2              28, 22           74 and 118      141
                                                      74 and 119      376
                                                      74 and 120      389
                                                      74 and 121      483
                                                      74 and 122      715
                                                      74 and 123      743
                                                      74 and 124      902
                                                      76 and 119      253
                                                      76 and 120      266
                                                      76 and 121      360
                                                      76 and 122      592
                                                      76 and 123      620
                                                      76 and 124      779
                                                      78 and 120      31
                                                      78 and 121      125
                                                      78 and 122      357
                                                      78 and 123      385
                                                      78 and 124      544
                                                      80 and 121      116
                                                      80 and 122      348
                                                      80 and 123      376
                                                      80 and 124      535
                                                      82 and 122      252
                                                      82 and 123      280
                                                      82 and 124      439
                                                      84 and 123      46
                                                      84 and 124      205
                                                      86 and 124      177
MIS-4   PS197U2                      43               90 and 125      517
                                                      90 and 126      751
                                                      90 and 127      821
                                                      92 and 126      258
                                                      92 and 127      328
                                                      94 and 127      92
MIS-5   KB33, KB38                   47, 48           97 and 128      109
                                                      97 and 129      379
                                                      97 and 130      504
                                                      98 and 129      291
                                                      98 and 130      416
                                                      99 and 130      144
MIS-6   PS196F3, PS168G1, PS196J4,   18, 30, 35, 37,  102 and 131     66
        PS202E1, PS10E1, PS185AA2    39, 45           102 and 132     259
                                                      102 and 133     245
                                                      102 and 134     754
                                                      104 and 132     213
                                                      104 and 133     199
                                                      104 and 134     708
                                                      106 and 133     31
                                                      106 and 134     518
                                                      108 and 134     526
MIS-7   PS205C, PS157C1 (157C1-A),   139, 141         135 and 136     598
        PS201Z
MIS-8   PS31F2, PS185Y2              142, 143         137 and 138     585
SUP-1   PS49C, PS158C2               10, 12, 15       53 and 54       370
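By way of illustration only, the comparison of an amplified fragment with the sizes indicated in Table 2 can be sketched in Python as follows. Only a few Table 2 entries are transcribed. The names EXPECTED_FRAGMENTS and matches_expected, and the 10% tolerance used to stand in for "approximately the indicated fragment size," are assumptions made for this sketch and are not part of the disclosure.

```python
# Subset of Table 2, keyed by the two primer SEQ ID NOS. of a pair.
# Values are (family, expected fragment size in nt).
EXPECTED_FRAGMENTS = {
    (56, 111): ("MIS-1", 69),
    (56, 112): ("MIS-1", 506),
    (58, 112): ("MIS-1", 458),
    (90, 125): ("MIS-4", 517),
    (135, 136): ("MIS-7", 598),
    (137, 138): ("MIS-8", 585),
    (53, 54): ("SUP-1", 370),
}


def matches_expected(primer_pair, observed_nt, tolerance=0.10):
    """Return the family name if the observed amplicon size is within
    `tolerance` (a hypothetical fractional margin) of the Table 2 size
    for this primer pair; otherwise return None."""
    entry = EXPECTED_FRAGMENTS.get(primer_pair)
    if entry is None:
        return None  # primer pair not in the transcribed subset
    family, expected_nt = entry
    if abs(observed_nt - expected_nt) <= tolerance * expected_nt:
        return family
    return None


print(matches_expected((53, 54), 375))  # SUP-1
print(matches_expected((53, 54), 500))  # None
```

A match of this kind is only suggestive; the hybridization and sequence identity criteria described above remain the identifying parameters.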

[0132] Furthermore, chimeric toxins may be used according to the subject invention. Methods have been developed for making useful chimeric toxins by combining portions of B.t. proteins. The portions which are combined need not, themselves, be pesticidal so long as the combination of portions creates a chimeric protein which is pesticidal. This can be done using restriction enzymes, as described in, for example, European Patent 0 228 838; Ge, A. Z., N. L. Shivarova, D. H. Dean (1989) Proc. Natl. Acad. Sci. USA 86:4037-4041; Ge, A. Z., D. Rivers, R. Milne, D. H. Dean (1991) J. Biol. Chem. 266:17954-17958; Schnepf, H. E., K. Tomczak, J. P. Ortega, H. R. Whiteley (1990) J. Biol. Chem. 265:20923-20930; Honee, G., D. Convents, J. Van Rie, S. Jansens, M. Peferoen, B. Visser (1991) Mol. Microbiol. 5:2799-2806. Alternatively, recombination using cellular recombination mechanisms can be used to achieve similar results. See, for example, Caramori, T., A. M. Albertini, A. Galizzi (1991) Gene 98:37-44; Widner, W. R., H. R. Whiteley (1990) J. Bacteriol. 172:2826-2832; Bosch, D., B. Schipper, H. van der Kleij, R. A. de Maagd, W. J. Stiekema (1994) Biotechnology 12:915-918. A number of other methods are known in the art by which such chimeric DNAs can be made. The subject invention is meant to include chimeric proteins that utilize the novel sequences identified in the subject application.

[0133] With the teachings provided herein, one skilled in the art could readily produce and use the various toxins and polynucleotide sequences described herein.

[0134] Genes and toxins. The genes and toxins useful according to the subject invention include not only the full length sequences but also fragments of these sequences, variants, mutants, and fusion proteins which retain the characteristic pesticidal activity of the toxins specifically exemplified herein. Chimeric genes and toxins, produced by combining portions from more than one Bacillus toxin or gene, may also be utilized according to the teachings of the subject invention. As used herein, the terms “variants” or “variations” of genes refer to nucleotide sequences which encode the same toxins or which encode equivalent toxins having pesticidal activity. As used herein, the term “equivalent toxins” refers to toxins having the same or essentially the same biological activity against the target pests as the exemplified toxins.

[0135] It is apparent to a person skilled in this art that genes encoding active toxins can be identified and obtained through several means. The specific genes exemplified herein may be obtained from the isolates deposited at a culture depository as described above. These genes, or portions or variants thereof, may also be constructed synthetically, for example, by use of a gene synthesizer. Variations of genes may be readily constructed using standard techniques for making point mutations. Also, fragments of these genes can be made using commercially available exonucleases or endonucleases according to standard procedures. For example, enzymes such as Bal31 or site-directed mutagenesis can be used to systematically cut off nucleotides from the ends of these genes. Also, genes which encode active fragments may be obtained using a variety of restriction enzymes. Proteases may be used to directly obtain active fragments of these toxins.

[0136] Equivalent toxins and/or genes encoding these equivalent toxins can be derived from Bacillus isolates and/or DNA libraries using the teachings provided herein. There are a number of methods for obtaining the pesticidal toxins of the instant invention. For example, antibodies to the pesticidal toxins disclosed and claimed herein can be used to identify and isolate toxins from a mixture of proteins. Specifically, antibodies may be raised to the portions of the toxins which are most constant and most distinct from other Bacillus toxins. These antibodies can then be used to specifically identify equivalent toxins with the characteristic activity by immunoprecipitation, enzyme linked immunosorbent assay (ELISA), or Western blotting. Antibodies to the toxins disclosed herein, or to equivalent toxins, or fragments of these toxins, can readily be prepared using standard procedures in this art. The genes which encode these toxins can then be obtained from the microorganism.

[0137] Fragments and equivalents which retain the pesticidal activity of the exemplified toxins are within the scope of the subject invention. Also, because of the redundancy of the genetic code, a variety of different DNA sequences can encode the amino acid sequences disclosed herein. It is well within the skill of a person trained in the art to create these alternative DNA sequences encoding the same, or essentially the same, toxins. These variant DNA sequences are within the scope of the subject invention. As used herein, reference to “essentially the same” sequence refers to sequences which have amino acid substitutions, deletions, additions, or insertions which do not materially affect pesticidal activity. Fragments retaining pesticidal activity are also included in this definition.
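The redundancy of the genetic code noted above can be illustrated with a short Python sketch that counts how many distinct DNA sequences encode a given peptide under the standard genetic code. The function names and the example peptides are hypothetical and are used only for illustration; they are not sequences of the subject invention.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks the three stop codons).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """List every codon that encodes the given one-letter amino acid."""
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid]


def count_encodings(peptide):
    """Number of distinct DNA sequences encoding the peptide
    (one-letter codes, no stop codon counted)."""
    total = 1
    for aa in peptide:
        total *= len(codons_for(aa))
    return total


print(codons_for("M"))         # ['ATG']
print(count_encodings("MLK"))  # 12  (1 x 6 x 2)
```

Even a short peptide therefore corresponds to many alternative DNA sequences, which is why alternate primers and probes encoding the same amino acid sequence can be constructed.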

[0138] A further method for identifying the toxins and genes of the subject invention is through the use of oligonucleotide probes. These probes are detectable nucleotide sequences. Probes provide a rapid method for identifying toxin-encoding genes of the subject invention. The nucleotide segments which are used as probes according to the invention can be synthesized using a DNA synthesizer and standard procedures.

[0139] Certain toxins of the subject invention have been specifically exemplified herein. Since these toxins are merely exemplary of the toxins of the subject invention, it should be readily apparent that the subject invention comprises variant or equivalent toxins (and nucleotide sequences coding for equivalent toxins) having the same or similar pesticidal activity of the exemplified toxin. Equivalent toxins will have amino acid homology with an exemplified toxin. This amino acid identity will typically be greater than 60%, preferably be greater than 75%, more preferably greater than 80%, more preferably greater than 90%, and can be greater than 95%. These identities are as determined using standard alignment techniques. The amino acid homology will be highest in critical regions of the toxin which account for biological activity or are involved in the determination of three-dimensional configuration which ultimately is responsible for the biological activity. In this regard, certain amino acid substitutions are acceptable and can be expected if these substitutions are in regions which are not critical to activity or are conservative amino acid substitutions which do not affect the three-dimensional configuration of the molecule. For example, amino acids may be placed in the following classes: non-polar, uncharged polar, basic, and acidic. Conservative substitutions whereby an amino acid of one class is replaced with another amino acid of the same type fall within the scope of the subject invention so long as the substitution does not materially alter the biological activity of the compound. Table 3 provides a listing of examples of amino acids belonging to each class.

TABLE 3

Class of Amino Acid    Examples of Amino Acids
Nonpolar               Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
Uncharged Polar        Gly, Ser, Thr, Cys, Tyr, Asn, Gln
Acidic                 Asp, Glu
Basic                  Lys, Arg, His
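The classification in Table 3 and the identity thresholds above lend themselves to a brief Python sketch. The class memberships are taken from Table 3. The function names, the representation of an alignment as two equal-length lists of three-letter residue codes, and the example residues are assumptions made for illustration; gap handling and the choice of alignment method are outside the sketch.

```python
# Amino acid classes as listed in Table 3.
AMINO_ACID_CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}


def class_of(residue):
    """Return the Table 3 class name for a three-letter residue code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative(original, replacement):
    """True when both residues fall in the same Table 3 class."""
    return class_of(original) == class_of(replacement)


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned positions carrying identical residues.
    Both inputs are equal-length lists from a prior alignment."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("alignments must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


print(is_conservative("Leu", "Ile"))  # True
print(is_conservative("Asp", "Lys"))  # False
print(percent_identity(["Ala", "Gly", "Asp", "Lys"],
                       ["Ala", "Gly", "Glu", "Lys"]))  # 75.0
```

Whether a given substitution is acceptable still depends on its effect on biological activity, which a class lookup of this kind cannot determine.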

[0140] In some instances, non-conservative substitutions can also be made. The critical factor is that these substitutions must not significantly detract from the biological activity of the toxin.

[0141] The δ-endotoxins of the subject invention can also be characterized in terms of the shape and location of toxin inclusions, which are described above.

[0142] As used herein, reference to “isolated” polynucleotides and/or “purified” toxins refers to these molecules when they are not associated with the other molecules with which they would be found in nature. Thus, reference to “isolated and purified” signifies the involvement of the “hand of man” as described herein. Chimeric toxins and genes also involve the “hand of man.”

[0143] Recombinant hosts. The toxin-encoding genes of the subject invention can be introduced into a wide variety of microbial or plant hosts. Expression of the toxin gene results, directly or indirectly, in the production and maintenance of the pesticide. With suitable microbial hosts, e.g., Pseudomonas, the microbes can be applied to the situs of the pest, where they will proliferate and be ingested. The result is a control of the pest. Alternatively, the microbe hosting the toxin gene can be killed and treated under conditions that prolong the activity of the toxin and stabilize the cell. The treated cell, which retains the toxic activity, then can be applied to the environment of the target pest.

[0144] Where the Bacillus toxin gene is introduced via a suitable vector into a microbial host, and said host is applied to the environment in a living state, it is essential that certain host microbes be used. Microorganism hosts are selected which are known to occupy the “phytosphere” (phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of one or more crops of interest. These microorganisms are selected so as to be capable of successfully competing in the particular environment (crop and other insect habitats) with the wild-type microorganisms, provide for stable maintenance and expression of the gene expressing the polypeptide pesticide, and, desirably, provide for improved protection of the pesticide from environmental degradation and inactivation.

[0145] A large number of microorganisms are known to inhabit the phylloplane (the surface of the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety of important crops. These microorganisms include bacteria, algae, and fungi. Of particular interest are microorganisms, such as bacteria, e.g., genera Pseudomonas, Erwinia, Serratia, Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc, and Alcaligenes; fungi, particularly yeast, e.g., genera Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula, and Aureobasidium. Of particular interest are such phytosphere bacterial species as Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens, Rhodopseudomonas spheroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus, and Azotobacter vinelandii; and phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R. marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae, and Aureobasidium pullulans. Of particular interest are the pigmented microorganisms.

[0146] A wide variety of ways are available for introducing a Bacillus gene encoding a toxin into a microorganism host under conditions which allow for stable maintenance and expression of the gene. These methods are well known to those skilled in the art and are described, for example, in U.S. Pat. No. 5,135,867, which is incorporated herein by reference.

[0147] Synthetic genes which are functionally equivalent to the toxins of the subject invention can also be used to transform hosts. Methods for the production of synthetic genes can be found in, for example, U.S. Pat. No. 5,380,831.

[0148] Treatment of cells. As mentioned above, Bacillus or recombinant cells expressing a Bacillus toxin can be treated to prolong the toxin activity and stabilize the cell. The pesticide microcapsule that is formed comprises the Bacillus toxin within a cellular structure that has been stabilized and will protect the toxin when the microcapsule is applied to the environment of the target pest. Suitable host cells may include either prokaryotes or eukaryotes. As hosts, of particular interest will be the prokaryotes and the lower eukaryotes, such as fungi. The cell will usually be intact and be substantially in the proliferative form when treated, rather than in a spore form.

[0149] Treatment of the microbial cell, e.g., a microbe containing the Bacillus toxin gene, can be by chemical or physical means, or by a combination of chemical and/or physical means, so long as the technique does not deleteriously affect the properties of the toxin, nor diminish the cellular capability of protecting the toxin. Methods for treatment of microbial cells are disclosed in U.S. Pat. Nos. 4,695,455 and 4,695,462, which are incorporated herein by reference.

[0150] Methods and formulations for control of pests. Control of pests using the isolates, toxins, and genes of the subject invention can be accomplished by a variety of methods known to those skilled in the art. These methods include, for example, the application of Bacillus isolates to the pests (or their location), the application of recombinant microbes to the pests (or their locations), and the transformation of plants with genes which encode the pesticidal toxins of the subject invention. Transformations can be made by those skilled in the art using standard techniques. Materials necessary for these transformations are disclosed herein or are otherwise readily available to the skilled artisan.

[0151] Formulated bait granules containing an attractant and the toxins of the Bacillus isolates, or recombinant microbes comprising the genes obtainable from the Bacillus isolates disclosed herein, can be applied to the soil. Formulated product can also be applied as a seed-coating or root treatment or total plant treatment at later stages of the crop cycle. Plant and soil treatments of Bacillus cells may be employed as wettable powders, granules or dusts, by mixing with various inert materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates, phosphates, and the like) or botanical materials (powdered corncobs, rice hulls, walnut shells, and the like). The formulations may include spreader-sticker adjuvants, stabilizing agents, other pesticidal additives, or surfactants. Liquid formulations may be aqueous-based or non-aqueous and employed as foams, gels, suspensions, emulsifiable concentrates, or the like. The ingredients may include rheological agents, surfactants, emulsifiers, dispersants, or polymers.

[0152] As would be appreciated by a person skilled in the art, the pesticidal concentration will vary widely depending upon the nature of the particular formulation, particularly whether it is a concentrate or to be used directly. The pesticide will be present in at least 1% by weight and may be 100% by weight. The dry formulations will have from about 1-95% by weight of the pesticide while the liquid formulations will generally be from about 1-60% by weight of the solids in the liquid phase. The formulations that contain cells will generally have from about 10² to about 10⁴ cells/mg. These formulations will be administered at about 50 mg (liquid or dry) to 1 kg or more per hectare.

[0153] The formulations can be applied to the environment of the pest, e.g., soil and foliage, by spraying, dusting, sprinkling, or the like.

[0154] Polynucleotide probes. It is well known that DNA possesses a fundamental property called base complementarity. In nature, DNA ordinarily exists in the form of pairs of anti-parallel strands, the bases on each strand projecting from that strand toward the opposite strand. The base adenine (A) on one strand will always be opposed to the base thymine (T) on the other strand, and the base guanine (G) will be opposed to the base cytosine (C). The bases are held in apposition by their ability to hydrogen bond in this specific way. Though each individual bond is relatively weak, the net effect of many adjacent hydrogen bonded bases, together with base stacking effects, is a stable joining of the two complementary strands. These bonds can be broken by treatments such as high pH or high temperature, and these conditions result in the dissociation, or “denaturation,” of the two strands. If the DNA is then placed in conditions which make hydrogen bonding of the bases thermodynamically favorable, the DNA strands will anneal, or “hybridize,” and reform the original double stranded DNA. If carried out under appropriate conditions, this hybridization can be highly specific. That is, only strands with a high degree of base complementarity will be able to form stable double stranded structures. The relationship of the specificity of hybridization to reaction conditions is well known. Thus, hybridization may be used to test whether two pieces of DNA are complementary in their base sequences. It is this hybridization mechanism which facilitates the use of probes of the subject invention to readily detect and characterize DNA sequences of interest.
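The base-pairing rule described above (A opposite T, G opposite C, on anti-parallel strands) can be expressed as a minimal Python sketch. The function names and the short example sequences are hypothetical, and the sketch scores only exact Watson-Crick pairing at each position; it does not model hybridization thermodynamics.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Sequence of the anti-parallel strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementary_fraction(probe, target_site):
    """Fraction of positions at which the probe pairs with a target site
    of the same length when the two are aligned anti-parallel."""
    if not probe or len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal, non-zero length")
    partner = reverse_complement(target_site)
    paired = sum(p == q for p, q in zip(probe.upper(), partner))
    return paired / len(probe)


print(reverse_complement("ATGC"))              # GCAT
print(complementary_fraction("GCAT", "ATGC"))  # 1.0
print(complementary_fraction("GCAA", "ATGC"))  # 0.75
```

A fraction below 1.0 corresponds to a probe with mismatches, which may still hybridize depending on the stringency of the conditions used.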

[0155] The probes may be RNA, DNA, or PNA (peptide nucleic acid). The probe will normally have at least about 10 bases, more usually at least about 17 bases, and may have up to about 100 bases or more. Longer probes can readily be utilized, and such probes can be, for example, several kilobases in length. The probe sequence is designed to be at least substantially complementary to a portion of a gene encoding a toxin of interest. The probe need not have perfect complementarity to the sequence to which it hybridizes. The probes may be labelled utilizing techniques which are well known to those skilled in this art.

[0156] One approach for the use of the subject invention as probes entails first identifying by Southern blot analysis of a gene bank of the Bacillus isolate all DNA segments homologous with the disclosed nucleotide sequences. Thus, it is possible, without the aid of biological analysis, to know in advance the probable activity of many new Bacillus isolates, and of the individual gene products expressed by a given Bacillus isolate. Such a probe analysis provides a rapid method for identifying potentially commercially valuable insecticidal toxin genes within the multifarious subspecies of B.t.

[0157] One hybridization procedure useful according to the subject invention typically includes the initial steps of isolating the DNA sample of interest and purifying it chemically. Either lysed bacteria or total fractionated nucleic acid isolated from bacteria can be used. Cells can be treated using known techniques to liberate their DNA (and/or RNA). The DNA sample can be cut into pieces with an appropriate restriction enzyme. The pieces can be separated by size through electrophoresis in a gel, usually agarose or acrylamide. The pieces of interest can be transferred to an immobilizing membrane.

[0158] The particular hybridization technique is not essential to the subject invention. As improvements are made in hybridization techniques, they can be readily applied.

[0159] The probe and sample can then be combined in a hybridization buffer solution and held at an appropriate temperature until annealing occurs. Thereafter, the membrane is washed free of extraneous materials, leaving the sample and bound probe molecules, which are typically detected and quantified by autoradiography and/or liquid scintillation counting. As is well known in the art, if the probe molecule and nucleic acid sample hybridize by forming a strong non-covalent bond between the two molecules, it can be reasonably assumed that the probe and sample are essentially identical. The probe's detectable label provides a means for determining in a known manner whether hybridization has occurred.

[0160] In the use of the nucleotide segments as probes, the particular probe is labeled with any suitable label known to those skilled in the art, including radioactive and non-radioactive labels. Typical radioactive labels include ³²P, ³⁵S, or the like. Non-radioactive labels include, for example, ligands such as biotin or thyroxine, as well as enzymes such as hydrolases or peroxidases, or the various chemiluminescers such as luciferin, or fluorescent compounds like fluorescein and its derivatives. The probes may be made inherently fluorescent as described in International Application No. WO 93/16094.

[0161] Various degrees of stringency of hybridization can be employed. The more severe the conditions, the greater the complementarity that is required for duplex formation. Severity can be controlled by temperature, probe concentration, probe length, ionic strength, time, and the like. Preferably, hybridization is conducted under moderate to high stringency conditions by techniques well known in the art, as described, for example, in Keller, G. H., M. M. Manak (1987) DNA Probes, Stockton Press, New York, N.Y., pp. 169-170.

[0162] As used herein “moderate to high stringency” conditions for hybridization refers to conditions which achieve the same, or about the same, degree of specificity of hybridization as the conditions employed by the current applicants. Examples of moderate and high stringency conditions are provided herein. Specifically, hybridization of immobilized DNA on Southern blots with ³²P-labeled gene-specific probes was performed by standard methods (Maniatis et al.). In general, hybridization and subsequent washes were carried out under moderate to high stringency conditions that allowed for detection of target sequences with homology to the exemplified toxin genes. For double-stranded DNA gene probes, hybridization was carried out overnight at 20-25° C. below the melting temperature (Tm) of the DNA hybrid in 6×SSPE, 5×Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the following formula (Beltz, G. A., K. A. Jacobs, T. H. Eickbush, P. T. Cherbas, and F. C. Kafatos [1983] Methods in Enzymology, R. Wu, L. Grossman and K. Moldave [eds.] Academic Press, New York 100:266-285).

[0163] Tm = 81.5° C. + 16.6 Log[Na+] + 0.41(% G+C) − 0.61(% formamide) − 600/(length of duplex in base pairs).

[0164] Washes are typically carried out as follows:

[0165] (1) Twice at room temperature for 15 minutes in 1×SSPE, 0.1% SDS (low stringency wash).

[0166] (2) Once at Tm-20° C. for 15 minutes in 0.2×SSPE, 0.1% SDS (moderate stringency wash).
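The melting temperature formula for double-stranded DNA probes, together with the hybridization and wash temperatures stated above, can be worked through in a minimal Python sketch. The logarithm is taken as base 10, as is customary for this formula. The function names are hypothetical, and the input values (1.0 M sodium, 50% G+C, a 600 base pair duplex, no formamide) are illustrative assumptions rather than conditions taken from the disclosure.

```python
import math


def duplex_tm(na_molar, percent_gc, length_bp, percent_formamide=0.0):
    """Melting temperature (deg C) of a DNA duplex:
    81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.61*(%formamide) - 600/length."""
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 600.0 / length_bp)


def hybridization_window(tm):
    """Hybridization range for double-stranded probes: 20-25 deg C below Tm."""
    return (tm - 25.0, tm - 20.0)


def moderate_wash_temperature(tm):
    """Temperature of the moderate stringency wash: Tm - 20 deg C."""
    return tm - 20.0


# Illustrative inputs only.
tm = duplex_tm(na_molar=1.0, percent_gc=50.0, length_bp=600)
print(round(tm, 1))  # 101.0
low, high = hybridization_window(tm)
print(round(low, 1), round(high, 1))  # 76.0 81.0
```

Because the formula depends on the sodium concentration of the solution actually used, the Tm for the hybridization buffer and for the wash buffer would in practice be computed separately.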

[0167] For oligonucleotide probes, hybridization was carried out overnight at 10-20° C. below the melting temperature (Tm) of the hybrid in 6×SSPE, 5×Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. Tm for oligonucleotide probes was determined by the following formula:

[0168] Tm (° C.) = 2(number T/A base pairs) + 4(number G/C base pairs) (Suggs, S. V., T. Miyake, E. H. Kawashima, M. J. Johnson, K. Itakura, and R. B. Wallace [1981] ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D. D. Brown [ed.], Academic Press, New York, 23:683-693).

[0169] Washes were typically carried out as follows:

[0170] (1) Twice at room temperature for 15 minutes in 1×SSPE, 0.1% SDS (low stringency wash).

[0171] (2) Once at the hybridization temperature for 15 minutes in 1×SSPE, 0.1% SDS (moderate stringency wash).
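The oligonucleotide formula above, Tm = 2(number T/A base pairs) + 4(number G/C base pairs), and the stated hybridization range of 10-20° C. below Tm can be illustrated with a short Python sketch. The function names and the 17-base example sequence are hypothetical; the example is not a primer or probe of the subject invention, and the sketch rejects degenerate bases rather than guessing how to score them.

```python
def oligo_tm(oligo):
    """Tm (deg C) of an oligonucleotide hybrid: 2 per A/T pair, 4 per G/C pair."""
    seq = oligo.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    if not seq or at_pairs + gc_pairs != len(seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 2 * at_pairs + 4 * gc_pairs


def oligo_hybridization_window(tm):
    """Hybridization range for oligonucleotide probes: 10-20 deg C below Tm."""
    return (tm - 20, tm - 10)


example = "ATGCATGCATGCATGCA"  # hypothetical 17-mer: 9 A/T and 8 G/C
tm = oligo_tm(example)
print(tm)                              # 50
print(oligo_hybridization_window(tm))  # (30, 40)
```

Under the wash scheme above, the moderate stringency wash for such a probe would be carried out at whichever temperature within this window was used for hybridization.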

[0172] In general, salt and/or temperature can be altered to change stringency. With a labeled DNA fragment >70 or so bases in length, the following conditions can be used:

[0173] Low: 1 or 2×SSPE, room temperature

[0174] Low: 1 or 2×SSPE, 42° C.

[0175] Moderate: 0.2× or 1×SSPE, 65° C.

[0176] High: 0.1×SSPE, 65° C.

[0177] Duplex formation and stability depend on substantial complementarity between the two strands of a hybrid, and, as noted above, a certain degree of mismatch can be tolerated. Therefore, the probe sequences of the subject invention include mutations (both single and multiple), deletions, insertions of the described sequences, and combinations thereof, wherein said mutations, insertions and deletions permit formation of stable hybrids with the target polynucleotide of interest. Mutations, insertions, and deletions can be produced in a given polynucleotide sequence in many ways, and these methods are known to an ordinarily skilled artisan. Other methods may become known in the future.

[0178] Thus, mutational, insertional, and deletional variants of the disclosed nucleotide sequences can be readily prepared by methods which are well known to those skilled in the art. These variants can be used in the same manner as the exemplified primer sequences so long as the variants have substantial sequence homology with the original sequence. As used herein, substantial sequence homology refers to homology which is sufficient to enable the variant probe to function in the same capacity as the original probe. Preferably, this homology is greater than 50%; more preferably, this homology is greater than 75%; and most preferably, this homology is greater than 90%. The degree of homology needed for the variant to function in its intended capacity will depend upon the intended use of the sequence. It is well within the skill of a person trained in this art to make mutational, insertional, and deletional mutations which are designed to improve the function of the sequence or otherwise provide a methodological advantage.

[0179] PCR technology. Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed synthesis of a nucleic acid sequence. This procedure is well known and commonly used by those skilled in this art (see Mullis, U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki, Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, Norman Arnheim [1985] “Enzymatic Amplification of β-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia,” Science 230:1350-1354). PCR is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers are oriented with the 3′ ends pointing towards each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences, and extension of the annealed primers with a DNA polymerase result in the amplification of the segment defined by the 5′ ends of the PCR primers. Since the extension product of each primer can serve as a template for the other primer, each cycle essentially doubles the amount of DNA fragment produced in the previous cycle. This results in the exponential accumulation of the specific target fragment, up to several million-fold in a few hours. By using a thermostable DNA polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus aquaticus, the amplification process can be completely automated. Other enzymes which can be used are known to those skilled in the art.
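The statement that each cycle essentially doubles the amount of fragment can be made concrete with a short Python sketch of the resulting exponential accumulation. The function names are hypothetical, and the per-cycle efficiency parameter is an assumption added for illustration: a value of 1.0 corresponds to the ideal doubling described above, while lower values model incomplete doubling.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after `cycles` rounds; efficiency 1.0 means perfect doubling."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(target_fold, efficiency=1.0):
    """Smallest number of cycles giving at least `target_fold` amplification."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be within (0, 1]")
    cycles, fold = 0, 1.0
    while fold < target_fold:
        fold *= 1.0 + efficiency
        cycles += 1
    return cycles


print(fold_amplification(20))  # 1048576.0
print(cycles_needed(1e6))      # 20
```

With ideal doubling, about 20 cycles give a million-fold increase, and a few further cycles give the several million-fold accumulation mentioned above.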

[0180] The DNA sequences of the subject invention can be used as primers for PCR amplification. In performing PCR amplification, a certain degree of mismatch can be tolerated between primer and template. Therefore, mutations, deletions, and insertions (especially additions of nucleotides to the 5′ end) of the exemplified primers fall within the scope of the subject invention. Mutations, insertions and deletions can be produced in a given primer by methods known to an ordinarily skilled artisan.

[0181] All of the U.S. patents cited herein are hereby incorporated by reference.

[0182] Following are examples which illustrate procedures for practicing the invention. These examples should not be construed as limiting. All percentages are by weight and all solvent mixture proportions are by volume unless otherwise noted.

EXAMPLE 1 Culturing of Bacillus Isolates Useful According to the Invention

[0183] Growth of cells. The cellular host containing the Bacillusinsecticidal gene may be grown in any convenient nutrient medium. Thesecells may then be harvested in accordance with conventional ways.Alternatively, the cells can be treated prior to harvesting.

[0184] The Bacillus cells of the invention can be cultured usingstandard art media and fermentation techniques. During the fermentationcycle, the bacteria can be harvested by first separating the Bacillusvegetative cells, spores, crystals, and lysed cellular debris from thefermentation broth by means well known in the art. Any Bacillus sporesor crystal δ-endotoxins formed can be recovered employing well-knowntechniques and used as a conventional δ-endotoxin B.t. preparation. Thesupernatant from the fermentation process contains toxins of the presentinvention. The toxins are isolated and purified employing well-knowntechniques.

[0185] A subculture of Bacillus isolates, or mutants thereof, can be used to inoculate the following medium, known as TB broth:

Tryptone         12 g/l
Yeast Extract    24 g/l
Glycerol         4 g/l
KH₂PO₄           2.1 g/l
K₂HPO₄           14.7 g/l
pH               7.4

[0186] The potassium phosphate was added to the autoclaved broth after cooling. Flasks were incubated at 30° C. on a rotary shaker at 250 rpm for 24-36 hours.
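
For convenience, the TB broth recipe can be scaled to any flask volume by simple proportion. The sketch below is a minimal illustration only; the helper name `scale_recipe` and the 250 ml example volume are assumptions made here, not values taken from the procedure.

```python
# TB broth components as listed in the text, in grams per litre.
TB_BROTH_G_PER_L = {
    "Tryptone": 12.0,
    "Yeast Extract": 24.0,
    "Glycerol": 4.0,
    "KH2PO4": 2.1,
    "K2HPO4": 14.7,
}


def scale_recipe(recipe_g_per_l: dict, volume_l: float) -> dict:
    """Return grams of each component needed for `volume_l` litres of medium."""
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    return {name: round(g * volume_l, 3) for name, g in recipe_g_per_l.items()}


if __name__ == "__main__":
    # Hypothetical 250 ml flask volume, chosen only for illustration.
    for name, grams in scale_recipe(TB_BROTH_G_PER_L, 0.25).items():
        print(f"{name}: {grams} g")
```

The potassium phosphates appear in the same table for bookkeeping only; as stated above, they are added after the autoclaved broth has cooled.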

[0187] The above procedure can be readily scaled up to large fermentors by procedures well known in the art.

[0188] The Bacillus obtained in the above fermentation can be isolated by procedures well known in the art. A frequently-used procedure is to subject the harvested fermentation broth to separation techniques, e.g., centrifugation. In a specific embodiment, Bacillus proteins useful according to the present invention can be obtained from the supernatant. The culture supernatant containing the active protein(s) can be used in bioassays.

[0189] Alternatively, a subculture of Bacillus isolates, or mutants thereof, can be used to inoculate the following peptone, glucose, salts medium:

Bacto Peptone      7.5 g/l
Glucose            1.0 g/l
KH₂PO₄             3.4 g/l
K₂HPO₄             4.35 g/l
Salts Solution     5.0 ml/l
CaCl₂ Solution     5.0 ml/l
pH                 7.2

Salts Solution (100 ml)
MgSO₄.7H₂O         2.46 g
MnSO₄.H₂O          0.04 g
ZnSO₄.7H₂O         0.28 g
FeSO₄.7H₂O         0.40 g

CaCl₂ Solution (100 ml)
CaCl₂.2H₂O         3.66 g

[0190] The salts solution and CaCl₂ solution are filter-sterilized and added to the autoclaved and cooked broth at the time of inoculation. Flasks are incubated at 30° C. on a rotary shaker at 200 rpm for 64 hr.

[0191] The above procedure can be readily scaled up to large fermentors by procedures well known in the art.

[0192] The Bacillus spores and/or crystals, obtained in the above fermentation, can be isolated by procedures well known in the art. A frequently-used procedure is to subject the harvested fermentation broth to separation techniques, e.g., centrifugation.

EXAMPLE 2 Isolation and Preparation of Cellular DNA for PCR

[0193] DNA can be prepared from cells grown on Spizizen's agar, or other minimal or enriched agar known to those skilled in the art, for approximately 16 hours. Spizizen's casamino acid agar comprises 23.2 g/l Spizizen's minimal salts [(NH₄)₂SO₄, 120 g; K₂HPO₄, 840 g; KH₂PO₄, 360 g; sodium citrate, 60 g; MgSO₄·7H₂O, 12 g. Total: 1392 g]; 1.0 g/l vitamin-free casamino acids; 15.0 g/l Difco agar. In preparing the agar, the mixture was autoclaved for 30 minutes, then a sterile, 50% glucose solution can be added to a final concentration of 0.5% (1/100 vol). Once the cells are grown for about 16 hours, an approximately 1 cm² patch of cells can be scraped from the agar into 300 μl of 10 mM Tris-HCl (pH 8.0)-1 mM EDTA. Proteinase K was added to 50 μg/ml, and the samples were incubated at 55° C. for 15 minutes. Other suitable proteases lacking nuclease activity can be used. The samples were then placed in a boiling water bath for 15 minutes to inactivate the proteinase and denature the DNA. This also precipitates unwanted components. The samples are then centrifuged at 14,000×g in an Eppendorf microfuge at room temperature for 5 minutes to remove cellular debris. The supernatants containing crude DNA were transferred to fresh tubes and frozen at −20° C. until used in PCR reactions.
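
The arithmetic behind the agar recipe above can be checked with a short calculation. The sketch below is illustrative only; the function names are assumptions introduced here. It confirms that the listed salts sum to 1392 g, converts the 23.2 g/l dose of the dry mix into per-litre amounts of each salt, and shows that diluting a 50% glucose stock to 0.5% corresponds to 1/100 of the final volume.

```python
# Dry-mix composition of Spizizen's minimal salts as listed in the text (grams).
SALTS_MIX_G = {
    "(NH4)2SO4": 120.0,
    "K2HPO4": 840.0,
    "KH2PO4": 360.0,
    "sodium citrate": 60.0,
    "MgSO4.7H2O": 12.0,
}
MIX_DOSE_G_PER_L = 23.2  # grams of dry mix per litre of agar, from the text


def per_litre(mix_g: dict, dose_g_per_l: float) -> dict:
    """Grams of each salt delivered per litre when the dry mix is dosed at dose_g_per_l."""
    total = sum(mix_g.values())
    return {name: dose_g_per_l * grams / total for name, grams in mix_g.items()}


def stock_volume_fraction(stock_percent: float, final_percent: float) -> float:
    """Fraction of the final volume that must be stock solution (C1*V1 = C2*V2)."""
    if stock_percent <= 0 or final_percent < 0 or final_percent > stock_percent:
        raise ValueError("need 0 <= final_percent <= stock_percent and stock_percent > 0")
    return final_percent / stock_percent


if __name__ == "__main__":
    print(sum(SALTS_MIX_G.values()))  # total mass of the dry mix in grams
    for name, grams in per_litre(SALTS_MIX_G, MIX_DOSE_G_PER_L).items():
        print(f"{name}: {grams:.2f} g/l")
    print(stock_volume_fraction(50.0, 0.5))  # fraction of final volume from the glucose stock
```

Under these figures the dry mix is a 60-litre batch, so each litre of agar receives roughly 2 g ammonium sulfate, 14 g K₂HPO₄, 6 g KH₂PO₄, 1 g sodium citrate, and 0.2 g MgSO₄·7H₂O.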

[0194] Alternatively, total cellular DNA may be prepared from plate-grown cells using the QIAamp Tissue Kit from Qiagen (Santa Clarita, Calif.) following instructions from the manufacturer.

EXAMPLE 3 Use of PCR Primers to Characterize and/or Identify Toxin Genes

[0195] Two primers useful in PCR procedures were designed to identify genes that encode pesticidal toxins. Preferably, these toxins are active against lepidopteran insects. The DNA from 95 B.t. strains was subjected to PCR using these primers. Two clearly distinguishable molecular weight bands were visible in “positive” strains, as outlined below. The frequency of strains yielding a 339 bp fragment was 29/95 (31%). This fragment is referred to herein as the “339 bp fragment” even though some small deviation in the exact number of base pairs may be observed.

GARCCRTGGA AAGCAAATAA TAARAATGC (SEQ ID NO. 1)

AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)

[0196] The strains which were positive for the 339 bp fragment (29 strains) were: PS11B, PS31G1, PS36A, PS49C, PS81A2, PS81F, PS81GG, PS81I, PS85A1, PS86BB1, PS86V1, PS86W1, PS89J3, PS91C2, PS94R1, PS101DD, PS158C2, PS185U2, PS192M4, PS202S, PS213E5, PS218G2, PS244A2, HD29, HD110, HD129, HD525, HD573a, and Javelin 1990.

[0197] The 24 strains which gave a larger (approximately 1.2 kb) fragment were: PS24J, PS33F2, PS45B1, PS52A1, PS62B1, PS80PP3, PS86A1, PS86Q3, PS88F16, PS92B, PS101Z2, PS123D1, PS157C1, PS169E, PS177F1, PS177G, PS185L2, PS201L1, PS204C3, PS204G4, PS242H10, PS242K17, PS244A2, PS244D1.

[0198] It was found that Bacillus strains producing lepidopteran-active proteins yielded only the 339 bp fragment. Few, if any, of the strains amplifying the approximately 1.2 kb fragment had known lepidopteran activity, but rather were coleopteran-, mite-, and/or nematode-active B.t. crystal protein producing strains.

EXAMPLE 4 DNA Sequencing of Toxin Genes Producing the 339 bp Fragment

[0199] PCR-amplified segments of toxin genes present in Bacillus strains can be readily sequenced. To accomplish this, amplified DNA fragments can be first cloned into the PCR DNA TA-cloning plasmid vector, pCRII, as described by the supplier (Invitrogen, San Diego, Calif.). Individual pCRII clones from the mixture of amplified DNA fragments from each Bacillus strain are chosen for sequencing. Colonies are lysed by boiling to release crude plasmid DNA. DNA templates for automated sequencing are amplified by PCR using vector-specific primers flanking the plasmid multiple cloning sites. These DNA templates are sequenced using Applied Biosystems (Foster City, Calif.) automated sequencing methodologies. The polypeptide sequences can be deduced from these nucleotide sequences.

[0200] DNA from three of the 29 B.t. strains which amplified the 339 bp fragment was sequenced. A DNA sequence encoding a toxin from strain PS36A is shown in SEQ ID NO. 3. An amino acid sequence for the 36A toxin is shown in SEQ ID NO. 4. A DNA sequence encoding a toxin from strain PS81F is shown in SEQ ID NO. 5. An amino acid sequence for the 81F toxin is shown in SEQ ID NO. 6. A DNA sequence encoding a toxin from strain Javelin 1990 is shown in SEQ ID NO. 7. An amino acid sequence for the Javelin 1990 toxin is shown in SEQ ID NO. 8.
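
Deducing a polypeptide sequence from a nucleotide sequence, as mentioned in this example, amounts to reading codons in frame against a genetic code table. The following Python sketch is a minimal illustration, assuming the standard codon assignments (which are shared by the bacterial translation table); the function name and the choice of test sequence are assumptions introduced here. The test sequence is the 27-nucleotide stretch beginning at the ATG of the primer listed as SEQ ID NO. 9 in Example 5.

```python
BASES = "TCAG"
# Standard codon assignments, ordered TTT, TTC, TTA, TTG, TCT, ... GGG; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA sequence in the given reading frame (0, 1, or 2).

    Codons containing anything other than A, C, G, or T are reported as 'X'.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    seq = dna.upper().replace(" ", "")
    protein = []
    for pos in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # Coding portion of the SEQ ID NO. 9 primer, starting at its ATG.
    print(translate("ATGAACAAGAATAATACTAAATTAAGC"))  # MNKNNTKLS
```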

EXAMPLE 5 Determination of DNA Sequences from Additional Genes Encoding Toxins from Strains PS158C2 and PS49C

[0201] Genes encoding novel toxins were identified from isolates PS158C2 and PS49C as follows: Total cellular DNA was extracted from B.t. strains using Qiagen (Santa Clarita, Calif.) Genomic-tip 500/G DNA extraction kits according to the supplier and was subjected to PCR using the oligonucleotide primer pairs listed below. Amplified DNA fragments were purified on Qiagen PCR purification columns and were used as templates for sequencing.

[0202] For PS158C2, the primers used were as follows.

[0203] 158C2 PRIMER A:

[0204] GCTCTAGAAGGAGGTAACTTATGAACAAGAATAATACTAAATTAAGC (SEQ ID NO. 9)

[0205] 339 reverse:

[0206] AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)

[0207] The resulting PCR-amplified DNA fragment was approximately 2 kbp in size. This DNA was partially sequenced by dideoxy chain termination using automated DNA sequencing technology (Perkin Elmer/Applied Biosystems, Foster City, Calif.). A DNA sequence encoding a portion of a soluble toxin from PS158C2 is shown in SEQ ID NO. 10.

[0208] For PS49C, two separate DNA fragments encoding parts of a novel toxin gene were amplified and sequenced. The first fragment was amplified using the following primer pair:

[0209] 49C PRIMER A:

[0210] CATCCTCCCTACACTTTCTAA (SEQ ID NO. 11)

[0211] 339 reverse:

[0212] AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)

[0213] The resulting approximately 1 kbp DNA fragment was used as a template for automated DNA sequencing. A sequence of a portion of a toxin gene from strain PS49C is shown in SEQ ID NO. 12.

[0214] The second fragment was amplified using the following primer pair:

[0215] 49C PRIMER B:

[0216] AAATTATGCGCTAAGTCTGC (SEQ ID NO. 13)

[0217] 49C PRIMER C:

[0218] TTGATCCGGACATAATAAT (SEQ ID NO. 14)

[0219] The resulting approximately 0.57 kbp DNA fragment was used as a template for automated DNA sequencing. An additional sequence of a portion of the toxin gene from PS49C is shown in SEQ ID NO. 15.

EXAMPLE 6 Additional Primers Useful for Characterizing and/or Identifying Toxin Genes

[0220] The following primer pair can be used to identify and/or characterize genes of the SUP-1 family:

[0221] SUP-1A:

[0222] GGATTCGTTATCAGAAA (SEQ ID NO. 53)

[0223] SUP-1B:

[0224] CTGTYGCTAACAATGTC (SEQ ID NO. 54)

[0225] These primers can be used in PCR procedures to amplify a fragment having a predicted size of approximately 370 bp. A band of the predicted size was amplified from strains PS158C2 and PS49C.

EXAMPLE 7 Additional Primers Useful for Characterizing and/or Identifying Toxin Genes

[0226] Another set of PCR primers can be used to identify and/or characterize additional genes encoding pesticidal toxins. The sequences of these primers were as follows:

[0227] GGRTTAMTTGGRTAYTATTT (SEQ ID NO. 16)

[0228] ATATCKWAYATTKGCATTTA (SEQ ID NO. 17)

[0229] Redundant nucleotide codes used throughout the subject disclosure are in accordance with the IUPAC convention and include:

[0230] R=A or G

[0231] M=A or C

[0232] Y=C or T

[0233] K=G or T

[0234] W=A or T
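
The redundant codes listed above determine how many distinct oligonucleotides a degenerate primer represents. The following Python sketch is illustrative only; the function names are assumptions introduced here. It uses the five codes listed above plus S (G or C), which is part of the standard IUPAC set and appears in SEQ ID NO. 50 although it is not in the list above.

```python
from itertools import product

# IUPAC nucleotide codes: the five listed in the text, plus S (standard IUPAC,
# used in SEQ ID NO. 50 but not included in the list above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "M": "AC", "Y": "CT", "K": "GT", "W": "AT",
    "S": "CG",
}


def _clean(primer: str) -> str:
    """Upper-case the primer and drop the spaces used to group bases in tens."""
    return primer.upper().replace(" ", "")


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in _clean(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in _clean(primer)]
    return ["".join(combo) for combo in product(*choices)]


def to_regex(primer: str) -> str:
    """Regular expression that matches any sequence the primer represents."""
    parts = []
    for base in _clean(primer):
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


if __name__ == "__main__":
    print(degeneracy("GGRTTAMTTGGRTAYTATTT"))  # SEQ ID NO. 16 -> 16
    print(expand("CTGTYGCTAACAATGTC"))         # SEQ ID NO. 54 -> two sequences
    print(to_regex("CTGTYGCTAACAATGTC"))       # CTGT[CT]GCTAACAATGTC
```

A degeneracy of 16, for example, means the primer preparation is a mixture of 16 oligonucleotides, each present at one-sixteenth of the total primer concentration if synthesis is unbiased.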

EXAMPLE 8 Identification and Sequencing of Genes Encoding Novel Soluble Protein Toxins from Bacillus Strains

[0235] PCR using primers SEQ ID NO. 16 and SEQ ID NO. 17 was performed on total cellular genomic DNA isolated from a broad range of B.t. strains. Those samples yielding an approximately 1 kb band were selected for characterization by DNA sequencing. Amplified DNA fragments were first cloned into the PCR DNA TA-cloning plasmid vector, pCR2.1, as described by the supplier (Invitrogen, San Diego, Calif.). Plasmids were isolated from recombinant clones and tested for the presence of an approximately 1 kbp insert by PCR using the plasmid vector primers, T3 and T7.

[0236] The following strains yielded the expected band of approximately 1000 bp, thus indicating the presence of a MIS-type toxin gene: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS168G1, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, PS202E1, KB33, KB38, PS33F1, PS157C1 (157C1-A), PS201Z, PS31F2, and PS185Y2.

[0237] Plasmids were then isolated for use as sequencing templates using QIAGEN (Santa Clarita, Calif.) miniprep kits as described by the supplier. Sequencing reactions were performed using the Dye Terminator Cycle Sequencing Ready Reaction Kit from PE Applied Biosystems. Sequencing reactions were run on an ABI PRISM 377 Automated Sequencer. Sequence data were collected, edited, and assembled using the ABI PRISM 377 Collection, Factura, and AutoAssembler software from PE ABI.

[0238] DNA sequences were determined for portions of novel toxin genes from the following isolates: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS168G1, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, PS202E1, KB33, KB38, PS33F1, PS157C1 (157C1-A), PS201Z, PS31F2, and PS185Y2. Polypeptide sequences were deduced for portions of the encoded, novel soluble toxins from the following isolates: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, PS202E1, and PS157C1 (toxin 157C1-A). These nucleotide sequences and amino acid sequences are shown in SEQ ID NOS. 18 to 48 and SEQ ID NOS. 139-144.

EXAMPLE 9 Restriction Fragment Length Polymorphism (RFLP) of Toxins from Bacillus thuringiensis Strains

[0239] Total cellular DNA was prepared from various Bacillus thuringiensis (B.t.) strains grown to an optical density of 0.5-0.8 at 600 nm visible light. DNA was extracted using the Qiagen Genomic-tip 500/G kit and Genomic DNA Buffer Set according to protocol for Gram positive bacteria (Qiagen Inc.; Valencia, Calif.).

[0240] Standard Southern hybridizations using ³²P-labeled probes were used to identify and characterize novel toxin genes within the total genomic DNA preparations. Prepared total genomic DNA was digested with various restriction enzymes, electrophoresed on a 1% agarose gel, and immobilized on a supported nylon membrane using standard methods (Maniatis et al.).

[0241] PCR-amplified DNA fragments 1.0-1.1 kb in length were gel purified for use as probes. Approximately 25 ng of each DNA fragment was used as a template for priming nascent DNA synthesis using DNA polymerase I Klenow fragment (New England Biolabs), random hexanucleotide primers (Boehringer Mannheim) and ³²P-dCTP.

[0242] Each ³²P-labeled fragment served as a specific probe to its corresponding genomic DNA blot. Hybridizations of immobilized DNA with randomly labeled ³²P probes were performed in standard aqueous buffer consisting of 5×SSPE, 5×Denhardt's solution, 0.5% SDS, 0.1 mg/ml at 65° C. overnight. Blots were washed under moderate stringency in 0.2×SSC, 0.1% SDS at 65° C. and exposed to film. RFLP data showing specific hybridization bands containing all or part of the novel gene of interest were obtained for each strain.

TABLE 4

(Strain)/     Probe Seq I.D.   RFLP Data (approximate band sizes)
Gene Name     Number
(PS)10E1      18               EcoRI: 4 and 9 kbp, EcoRV: 4.5 and 6 kbp, KpnI: 12 and 24 kbp, SacI: 13 and 24 kbp, SalI: >23 kbp, XbaI: 5 and 15 kbp
(PS)31J2      20               ApaI: >23 kbp, BglII: 6.5 kbp, PstI: >23 kbp, SacI: >23 kbp, SalI: >23 kbp, XbaI: 5 kbp
(PS)33D2      22               EcoRI: 10 kbp, EcoRV: 15 kbp, HindIII: 18 kbp, KpnI: 9.5 kbp, PstI: 8 kbp
(PS)66D3      24               BamHI: 4.5 kbp, HindIII: >23 kbp, KpnI: 23 kbp, PstI: 15 kbp, XbaI: >23 kbp
(PS)68F       26               EcoRI: 8.5 and 15 kbp, EcoRV: 7 and 18 kbp, HindIII: 2.1 and 9.5 kbp, PstI: 3 and 18 kbp, XbaI: 10 and 15 kbp
(PS)69AA2     28               EcoRV: 9.5 kbp, HindIII: 18 kbp, KpnI: 23 kbp, NheI: >23 kbp, PstI: 10 kbp, SalI: >23 kbp
(PS)168G1     30               EcoRI: 10 kbp, EcoRV: 3.5 kbp, NheI: 20 kbp, PstI: 20 kbp, SalI: >23 kbp, XbaI: 15 kbp
(PS)177I8     33               BamHI: >23 kbp, EcoRI: 10 kbp, HindIII: 2 kbp, SalI: >23 kbp, XbaI: 3.5 kbp
(PS)185AA2    35               EcoRI: 7 kbp, EcoRV: 10 kbp (& 3.5 kbp?), NheI: 4 kbp, PstI: 3 kbp, SalI: >23 kbp, XbaI: 4 kbp
(PS)196F3     37               EcoRI: 8 kbp, EcoRV: 9 kbp, NheI: 18 kbp, PstI: 18 kbp, SalI: 20 kbp, XbaI: 7 kbp
(PS)196J4     39               BamHI: >23 kbp, EcoRI: 3.5 and 4.5 kbp, PstI: 9 and 24 kbp, SalI: >23 kbp, XbaI: 2.4 kbp and 12 kbp
(PS)197T1     41               HindIII: 10 kbp, KpnI: 20 kbp, PstI: 20 kbp, SacI: 20 kbp, SpeI: 15 kbp, XbaI: 5 kbp
(PS)197U2     43               EcoRI: 5 kbp, EcoRV: 1.9 kbp, NheI: 20 kbp, PstI: 23 kbp, SalI: >23 kbp, XbaI: 7 kbp
(PS)202E1     45               EcoRV: 7 kbp, KpnI: 12 kbp, NheI: 10 kbp, PstI: 15 kbp, SalI: 23 kbp, XbaI: 1.8 kbp
KB33          47               EcoRI: 9 kbp, EcoRV: 6 kbp, HindIII: 8 kbp, KpnI: >23 kbp, NheI: 22 kbp, SalI: >23 kbp
KB38          48               BamHI: 5.5 kbp, EcoRV: 22 kbp, HindIII: 2.2 kbp, NheI: 20 kbp, PstI: >23 kbp

[0243] In separate experiments, alternative probes for MIS and WAR genes were used to detect novel toxin genes on Southern blots of genomic DNA by ³²P autoradiography or by non-radioactive methods using the DIG nucleic acid labeling and detection system (Boehringer Mannheim; Indianapolis, Ind.). DNA fragments approximately 2.6 kbp (PS177C8 MIS toxin gene; SEQ ID NO. 31) and 1.3 kbp (PS177C8 WAR toxin gene; SEQ ID NO. 51) in length were PCR amplified from plasmid pMYC2450 and used as the probes for all strains listed. Fragments were gel purified and approximately 25 ng of each DNA fragment was randomly labeled with ³²P for radioactive detection or approximately 300 ng of each DNA fragment was randomly labeled with the DIG High Prime kit for nonradioactive detection. Hybridizations of immobilized DNA with randomly labeled ³²P probes were performed in standard formamide conditions: 50% formamide, 5×SSPE, 5×Denhardt's solution, 2% SDS, 0.1 mg/ml sonicated sperm DNA at 42° C. overnight. Blots were washed under low stringency in 2×SSC, 0.1% SDS at 42° C. and exposed to film. RFLP data showing DNA bands containing all or part of the novel gene of interest were obtained for each strain.

[0244] RFLP data using Probe 177C8-MIS (SEQ ID NO. 31) were as follows:

TABLE 5

RFLP Class   Strain Name(s)                          RFLP Data (approximate band size in base pairs)
A            177C8, 74H3, 66D3                       HindIII: 2,454; 1,645
                                                     XbaI: 14,820; 9,612; 8,138; 5,642; 1,440
B            177I8                                   HindIII: 2,454
                                                     XbaI: 3,500 (very faint 7,000)
C            66D3                                    HindIII: 2,454 (faint 20,000)
                                                     XbaI: 3,500 (faint 7,000)
D            28M, 31F2, 71G5, 71G7, 71I1, 71N1,      HindIII: 11,738; 7,614
             146F, 185Y2, 201JJ7, KB73, KB68B46-2,   XbaI: 10,622; 6,030
             KB71A35-4, KB71A116-1
D₁           70B2, 71C2                              HindIII: 11,738; 8,698; 7,614
                                                     XbaI: 11,354; 10,622; 6,030
E            KB68B51-2, KB68B55-2                    HindIII: 6,975; 2,527
                                                     XbaI: 10,000; 6,144
F            KB53A49-4                               HindIII: 5,766
                                                     XbaI: 6,757
G            86D1                                    HindIII: 4,920
                                                     XbaI: 11,961
H            HD573B, 33F1, 67B3                      HindIII: 6,558; 1,978
                                                     XbaI: 7,815; 6,558
I            205C, 40C1                              HindIII: 6,752
                                                     XbaI: 4,618
J            130A3, 143A2, 157C1                     HindIII: 9,639; 3,943; 1,954; 1,210
                                                     XbaI: 7,005; 6,165; 4,480; 3,699
K            201Z                                    HindIII: 9,639; 4,339
                                                     XbaI: 7,232; 6,365
L            71G4                                    HindIII: 7,005
                                                     XbaI: 9,639
M            KB42A33-8, KB71A72-1, KB71A133-11       HindIII: 3,721
                                                     XbaI: 3,274
N            KB71A134-2                              HindIII: 7,523
                                                     XbaI: 10,360; 3,490
O            KB69A125-3, KB69A127-7, KB69A136-2,     HindIII: 6,360; 3,726; 1,874; 1,098
             KB71A20-4                               XbaI: 6,360; 5,893; 5,058; 3,726

[0245] RFLP data using Probe 177C8-WAR (SEQ ID NO. 51) were as follows:

TABLE 6

RFLP Class   Strain Name(s)                          RFLP Data (approximate band size in base pairs)
A            177C8, 74H3                             HindIII: 3,659, 2,454, 606
                                                     XbaI: 5,457, 4,469, 1,440, 966
B            177I8, 66D3                             data unavailable
C            28M, 31F2, 71G5, 71G7, 71I1, 71N1,      HindIII: 7,614
             146F, 185Y2, 201JJ7, KB73, KB68B46-2,   XbaI: 10,982, 6,235
             KB71A35-4, KB71A116-1
C₁           70B2, 71C2                              HindIII: 8,698, 7,614
                                                     XbaI: 11,354, 6,235
D            KB68B51-2, KB68B55-2                    HindIII: 7,200
                                                     XbaI: 6,342 (and 11,225 for 51-2) (and 9,888 for 55-2)
E            KB53A49-4                               HindIII: 5,766
                                                     XbaI: 6,757
F            HD573B, 33F1, 67B3                      HindIII: 3,348, 2,037 (and 6,558 for HD573B only)
                                                     XbaI: 6,953 (and 7,815, 6,185 for HD573B only)
G            205C, 40C1                              HindIII: 3,158
                                                     XbaI: 6,558, 2,809
H            130A3, 143A2, 157C1                     HindIII: 4,339, 3,361, 1,954, 660, 349
                                                     XbaI: 9,043, 4,203, 3,583, 2,958, 581, 464
I            201Z                                    HindIII: 4,480, 3,819, 703
                                                     XbaI: 9,336, 3,256, 495
J            71G4                                    HindIII: 7,005
                                                     XbaI: 9,639
K            KB42A33-8, KB71A72-1, KB71A133-11       no hybridization signal
L            KB71A134-2                              HindIII: 7,523
                                                     XbaI: 10,360
M            KB69A125-3, KB69A127-7, KB69A136-2,     HindIII: 5,058; 3,726; 3,198; 2,745; 257
             KB71A20-4                               XbaI: 5,255; 4,341; 3,452; 1,490; 474

EXAMPLE 10 Use of Additional PCR Primers for Characterizing and/or Identifying Novel Genes

[0246] Another set of PCR primers can be used to identify additional novel genes encoding pesticidal toxins. The sequences of these primers were as follows:

[0247] ICON-forward:

[0248] CTTGAYTTTAAARATGATRTA (SEQ ID NO. 49)

[0249] ICON-reverse:

[0250] AATRGCSWATAAATAMGCACC (SEQ ID NO. 50)

[0251] These primers can be used in PCR procedures to amplify a fragment having a predicted size of about 450 bp.

[0252] Strains PS177C8, PS177I8, and PS66D3 were screened and were found to have genes amplifiable with these ICON primers. A sequence of a toxin gene from PS177C8 is shown in SEQ ID NO. 51. An amino acid sequence of the 177C8-ICON toxin is shown in SEQ ID NO. 52.

EXAMPLE 11

[0253] Use of Mixed Primer Pairs to Characterize and/or Identify Toxin Genes

[0254] Various combinations of the primers described herein can be used to identify and/or characterize toxin genes. PCR conditions can be used as indicated below:

                 SEQ ID NO. 16/17               SEQ ID NO. 49/50               SEQ ID NO. 49/17
Pre-denature     94° C. 1 min.                  94° C. 1 min.                  94° C. 1 min.
Program cycle    94° C. 1 min.                  94° C. 1 min.                  94° C. 1 min.
                 42° C. 2 min.                  42° C. 2 min.                  42° C. 2 min.
                 72° C. 3 min. + 5 sec/cycle    72° C. 3 min. + 5 sec/cycle    72° C. 3 min. + 5 sec/cycle
                 Repeat cycle 29 times          Repeat cycle 29 times          Repeat cycle 29 times
                 Hold 4° C.                     Hold 4° C.                     Hold 4° C.
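
The programmed duration of the cycling conditions above can be estimated with a short calculation. The sketch below is illustrative only and rests on two assumptions that the table does not state explicitly: that "repeat cycle 29 times" means 30 passes through the cycle in total, and that the 5-second extension increment starts at zero on the first pass and grows by 5 seconds on each later pass. Ramp times between temperatures and the final 4° C. hold are ignored. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int
    extend_per_cycle_s: int = 0  # seconds added on each successive pass (assumed to start at 0)


PRE_DENATURE = Step(94, 60)
CYCLE = (
    Step(94, 60),                           # denaturation
    Step(42, 120),                          # annealing
    Step(72, 180, extend_per_cycle_s=5),    # extension, lengthened 5 s per cycle
)
REPEATS = 29  # "repeat cycle 29 times": assumed to give 30 passes in total


def programmed_seconds(pre: Step, cycle: tuple, repeats: int) -> int:
    """Total programmed hold time in seconds, excluding ramping and the final hold."""
    total = pre.seconds
    for n in range(repeats + 1):
        for step in cycle:
            total += step.seconds + n * step.extend_per_cycle_s
    return total


if __name__ == "__main__":
    total = programmed_seconds(PRE_DENATURE, CYCLE, REPEATS)
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    print(total, f"= {hours} h {minutes} min {seconds} s")  # 13035 = 3 h 37 min 15 s
```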

[0255] Using the above protocol, a strain harboring a MIS-type of toxin would be expected to yield a 1000 bp fragment with the SEQ ID NO. 16/17 primer pair. A strain harboring a WAR-type of toxin would be expected to amplify a fragment of about 475 bp with the SEQ ID NO. 49/50 primer pair, or a fragment of about 1800 bp with the SEQ ID NO. 49/17 primer pair. The amplified fragments of the expected size were found in four strains. The results are reported in Table 7.

TABLE 7

Approximate Amplified Fragment Sizes (bp)

Strain     SEQ ID NO. 16/17   SEQ ID NO. 49/50        SEQ ID NO. 49/17
PS66D3     1000               900, 475                1800
PS177C8    1000               475                     1800
PS177I8    1000               900, 550, 475           1800
PS217U2    1000               2500, 1500, 900, 475    no band detected
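
The interpretation rule stated in this example (about 1000 bp with primer pair 16/17 for a MIS-type gene; about 475 bp with 49/50 or about 1800 bp with 49/17 for a WAR-type gene) can be expressed as a small lookup. The Python sketch below is illustrative only; the ±10% sizing tolerance, the function names, and the data layout are assumptions introduced here. The band sizes are those reported in Table 7.

```python
# Expected fragment sizes (bp) per toxin type and primer pair, from the text.
EXPECTED = {
    "MIS": [("16/17", 1000)],
    "WAR": [("49/50", 475), ("49/17", 1800)],
}
TOLERANCE = 0.10  # assumed +/-10 % gel sizing error; not taken from the source


def matches(observed_bp: float, expected_bp: float, tol: float = TOLERANCE) -> bool:
    """True if an observed band is within the tolerance of the expected size."""
    return abs(observed_bp - expected_bp) <= tol * expected_bp


def classify(bands_by_pair: dict) -> set:
    """Return the toxin types whose expected fragment appears among the observed bands."""
    found = set()
    for toxin_type, signatures in EXPECTED.items():
        for pair, size in signatures:
            if any(matches(band, size) for band in bands_by_pair.get(pair, [])):
                found.add(toxin_type)
    return found


# Approximate band sizes reported in Table 7 (an empty list means no band detected).
TABLE_7 = {
    "PS66D3": {"16/17": [1000], "49/50": [900, 475], "49/17": [1800]},
    "PS177C8": {"16/17": [1000], "49/50": [475], "49/17": [1800]},
    "PS177I8": {"16/17": [1000], "49/50": [900, 550, 475], "49/17": [1800]},
    "PS217U2": {"16/17": [1000], "49/50": [2500, 1500, 900, 475], "49/17": []},
}

if __name__ == "__main__":
    for strain, bands in TABLE_7.items():
        print(strain, sorted(classify(bands)))
```

Under this rule all four strains in Table 7 score as carrying both MIS-type and WAR-type genes; PS217U2 qualifies as WAR-type through its 475 bp band even though no band was detected with the 49/17 pair.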

EXAMPLE 12 Characterization and/or Identification of WAR Toxins

[0256] In a further embodiment of the subject invention, pesticidal toxins can be characterized and/or identified by their level of reactivity with antibodies to pesticidal toxins exemplified herein. In a specific embodiment, antibodies can be raised to WAR toxins such as the toxin obtainable from PS177C8a. Other WAR toxins can then be identified and/or characterized by their reactivity with the antibodies. In a preferred embodiment, the antibodies are polyclonal antibodies. In this example, toxins with the greatest similarity to the 177C8a-WAR toxin would have the greatest reactivity with the polyclonal antibodies. WAR toxins with greater diversity react with the 177C8a polyclonal antibodies, but to a lesser extent. Toxins which immunoreact with polyclonal antibodies raised to the 177C8a WAR toxin can be obtained from, for example, the isolates designated PS177C8a, PS177I8, PS66D3, KB68B55-2, PS185Y2, PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, PS146D, PS74H3, PS28M, PS71G6, PS71G7, PS71I1, PS71N1, PS201JJ7, KB73, KB68B46-2, KB71A35-4, KB71A116-1, PS70B2, PS71C2, PS86D1, HD573B, PS33F1, PS67B3, PS205C, PS40C1, PS130A3, PS143A2, PS157C1, PS201Z, PS71G4, KB42A33-8, KB71A72-1, KB71A133-11, KB71A134-2, KB69A125-3, KB69A127-7, KB69A136-2, and KB71A20-4. Such diverse WAR toxins can be further characterized by, for example, whether or not their genes can be amplified with ICON primers. For example, the following isolates do not have polynucleotide sequences which are amplified by ICON primers: PS177C8a, PS177I8, PS66D3, KB68B55-2, PS185Y2, PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, and PS146D. Of these, isolates PS28K1, PS31F2, KB68B46-2, and PS146D show the weakest antibody reactivity, suggesting advantageous diversity.

EXAMPLE 13 Bioassays for Activity Against Lepidopterans and Coleopterans

[0257] Biological activity of the toxins and isolates of the subject invention can be confirmed using standard bioassay procedures. One such assay is the budworm-bollworm (Heliothis virescens [Fabricius] and Helicoverpa zea [Boddie]) assay. Lepidoptera bioassays were conducted with either surface application to artificial insect diet or diet incorporation of samples. All Lepidopteran insects were tested from the neonate stage to the second instar. All assays were conducted with either toasted soy flour artificial diet or black cutworm artificial diet (BioServ, Frenchtown, N.J.).

[0258] Diet incorporation can be conducted by mixing the samples with artificial diet at a rate of 6 mL suspension plus 54 mL diet. After vortexing, this mixture is poured into plastic trays with compartmentalized 3-ml wells (Nutrend Container Corporation, Jacksonville, Fla.). A water blank containing no B.t. serves as the control. First instar larvae (USDA-ARS, Stoneville, Miss.) are placed onto the diet mixture. Wells are then sealed with Mylar sheeting (ClearLam Packaging, Illinois) using a tacking iron, and several pinholes are made in each well to provide gas exchange. Larvae were held at 25° C. for 6 days in a 14:10 (light:dark) holding room. Mortality and stunting are recorded after six days.
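
Mixing 6 mL of sample suspension with 54 mL of diet gives a ten-fold dilution of the sample. The sketch below shows that arithmetic; it is a minimal illustration only, and the function names and the example stock concentration are assumptions introduced here rather than values from the assay.

```python
def dilution_factor(sample_ml: float = 6.0, diet_ml: float = 54.0) -> float:
    """Fold dilution of the sample after mixing it into the diet."""
    if sample_ml <= 0 or diet_ml < 0:
        raise ValueError("sample_ml must be positive and diet_ml non-negative")
    return (sample_ml + diet_ml) / sample_ml


def final_concentration(stock_conc: float, sample_ml: float = 6.0, diet_ml: float = 54.0) -> float:
    """Concentration of the sample in the finished diet, in the units of stock_conc."""
    return stock_conc / dilution_factor(sample_ml, diet_ml)


if __name__ == "__main__":
    print(dilution_factor())           # 10.0
    # Hypothetical stock of 100 units per mL, chosen only for illustration.
    print(final_concentration(100.0))  # 10.0
```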

[0259] Bioassay by the top load method utilizes the same sample and diet preparations as listed above. The samples are applied to the surface of the insect diet. In a specific embodiment, surface area ranged from 0.3 to approximately 0.8 cm² depending on the tray size; 96-well tissue culture plates were used in addition to the format listed above. Following application, samples are allowed to air dry before insect infestation. A water blank containing no B.t. can serve as the control. Eggs are applied to each treated well, and the wells are then sealed with Mylar sheeting (ClearLam Packaging, Illinois) using a tacking iron, and pinholes are made in each well to provide gas exchange. Bioassays are held at 25° C. for 7 days in a 14:10 (light:dark) or 28° C. for 4 days in a 14:10 (light:dark) holding room. Mortality and insect stunting are recorded at the end of each bioassay.

[0260] Another assay useful according to the subject invention is the Western corn rootworm assay. Samples can be bioassayed against neonate western corn rootworm larvae (Diabrotica virgifera virgifera) via top-loading of sample onto an agar-based artificial diet at a rate of 160 ml/cm². Artificial diet can be dispensed into 0.78 cm² wells in 48-well tissue culture or similar plates and allowed to harden. After the diet solidifies, samples are dispensed by pipette onto the diet surface. Excess liquid is then evaporated from the surface prior to transferring approximately three neonate larvae per well onto the diet surface by camel's hair brush. To prevent insect escape while allowing gas exchange, wells are heat-sealed with 2-mil punched polyester film with 27HT adhesive (Oliver Products Company, Grand Rapids, Mich.). Bioassays are held in darkness at 25° C., and mortality is scored after four days.

[0261] Analogous bioassays can be performed by those skilled in the art to assess activity against other pests, such as the black cutworm (Agrotis ipsilon).

[0262] Results are shown in Table 8.

TABLE 8

Genetics and function of concentrated B.t. supernatants screened for lepidopteran and coleopteran activity

Strain        339 bp PCR  Total Protein  Approx. 80-100 kDa  H. virescens  H. virescens  H. zea       H. zea    Diabrotica
              fragment    (μg/cm²)       protein (μg/cm²)    % mortality   stunting      % mortality  stunting  % mortality
PS31G1        +           8.3            2.1                 70            yes           39           yes       NT
PS49C         +           13.6           1.5                 8             yes           8            no        NT
PS80JJ1       —           8.0            NT                  18            no            13           no        NT
PS80JJ1 (#2)  —           35             NT                  —             —             —            —         43
PS81A2 (#1)   +           30.3           2.3                 100           yes           38           yes       NT
PS81A2 (#2)   +           18.8           1.6                 38            yes           13           no        NT
PS81F         ++          26             5.2                 100           yes           92           yes       NT
PS81I         +           10.7           1.7                 48            yes           13           no        NT
PS86B1 (#1)   —           23.2           4.5                 17            no            13           no        —
PS86B1 (#2)   —           90             17.5                —             —             —            —         35
PS86B1 (#3)   —           35             6.8                 —             —             —            —         10
PS122D3 (#1)  —           33.2           1.8                 21            no            21           no        —
PS122D3 (#2)  —           124            6.7                 —             —             —            —         45
PS122D3 (#3)  —           35             1.9                 —             —             —            —         16
PS123D1 (#1)  —           10.7           NT                  0             no            0            no        —
PS123D1 (#2)  —           69             NT                  —             —             —            —         54
PS123D1 (#3)  —           35             NT                  —             —             —            —         21
PS123D1 (#4)  —           17.8           NT                  5             no            4            no        NT
PS149B1 (#1)  NT          9              NT                  0             no            0            yes       NT
PS149B1 (#2)  NT          35             NT                  —             —             —            —         50
PS157C1 (#1)  —           24             2                   43            yes           13           yes       —
PS157C1 (#2)  —           93             8                   —             —             —            —         40
PS157C1 (#3)  —           35             3                   —             —             —            —         18
PS185L2 (#1)  —           2              NT                  8             no            0            no        NT
PS185L2 (#2)  —           3              NT                  10            no            25           no        NT
PS185U2       +           23.4           2.9                 100           yes           100          yes       NT
PS192M4       +           10.7           2.0                 9             no            4            yes       NT
HD129         +           44.4           4.9                 100           yes           50           yes       NT
Javelin 1990  ++          43.2           3.6                 100           yes           96           yes       NT
water                                                        0-8           —             0-4          —         12

EXAMPLE 14 Results of Western Corn Rootworm Bioassays and Further Characterization of the Toxins

[0263] Concentrated liquid supernatant solutions, obtained according to the subject invention, were tested for activity against Western corn rootworm (WCRW). Supernatants from the following isolates were found to cause mortality against WCRW: PS10E1, PS31F2, PS31J2, PS33D2, PS66D3, PS68F, PS80JJ1, PS146D, PS175I4, PS177I8, PS196J4, PS197T1, PS197U2, KB33, KB53A49-4, KB68B46-2, KB68B51-2, KB68B55-2, PS177C8, PS69AA2, KB38, PS196F3, PS168G1, PS202E1, PS217U2 and PS185AA2.

[0264] Supernatants from the following isolates were also found to cause mortality against WCRW: PS205A3, PS185V2, PS234E1, PS71G4, PS248N10, PS191A21, KB63B19-13, KB63B19-7, KB68B62-7, KB68B63-2, KB69A125-1, KB69A125-3, KB69A125-5, KB69A127-7, KB69A132-1, KB69B2-1, KB70B5-3, KB71A125-15, and KB71A35-6; it was confirmed that this activity was heat labile. Furthermore, it was determined that the supernatants of the following isolates did not react (yielded negative test results) with the WAR antibody (see Example 12), and did not react with the MIS (SEQ ID NO. 31) and WAR (SEQ ID NO. 51) probes: PS205A3, PS185V2, PS234E1, PS71G4, PS248N10, PS191A21, KB63B19-13, KB63B19-7, KB68B62-7, KB68B63-2, KB69A125-1, KB69A125-5, KB69A132-1, KB69B2-1, KB70B5-3, KB71A125-15, and KB71A35-6; the supernatants of isolates KB69A125-3 and KB69A127-7 yielded positive test results.

EXAMPLE 15 Results of Budworm/Bollworm Bioassays

[0265] Concentrated liquid supernatant solutions, obtained according to the subject invention, were tested for activity against Heliothis virescens (H.v.) and Helicoverpa zea (H.z.). Supernatants from the following isolates were tested and were found to cause mortality against H.v.: PS157C1, PS31G1, PS49C, PS81F, PS81I, Javelin 1990, PS158C2, PS202S, PS36A, HD110, and HD29. Supernatants from the following isolates were tested and were found to cause significant mortality against H.z.: PS31G1, PS49C, PS81F, PS81I, PS157C1, PS158C2, PS36A, HD110, and Javelin 1990.

EXAMPLE 16 Target Pests

[0266] Toxins of the subject invention can be used, alone or in combination with other toxins, to control one or more non-mammalian pests. These pests may be, for example, those listed in Table 9. Activity can readily be confirmed using the bioassays provided herein, adaptations of these bioassays, and/or other bioassays well known to those skilled in the art.

TABLE 9

Target pest species

ORDER/Common Name                       Latin Name
LEPIDOPTERA
European Corn Borer                     Ostrinia nubilalis
European Corn Borer resistant to Cry1A  Ostrinia nubilalis
Black Cutworm                           Agrotis ipsilon
Fall Armyworm                           Spodoptera frugiperda
Southwestern Corn Borer                 Diatraea grandiosella
Corn Earworm/Bollworm                   Helicoverpa zea
Tobacco Budworm                         Heliothis virescens
Tobacco Budworm Rs                      Heliothis virescens
Sunflower Head Moth                     Homoeosoma electellum
Banded Sunflower Moth                   Cochylis hospes
Argentine Looper                        Rachiplusia nu
Spilosoma                               Spilosoma virginica
Bertha Armyworm                         Mamestra configurata
Diamondback Moth                        Plutella xylostella
COLEOPTERA
Red Sunflower Seed Weevil               Smicronyx fulvus
Sunflower Stem Weevil                   Cylindrocopturus adspersus
Sunflower Beetle                        Zygogramma exclamationis
Canola Flea Beetle                      Phyllotreta cruciferae
Western Corn Rootworm                   Diabrotica virgifera virgifera
DIPTERA
Hessian Fly                             Mayetiola destructor
HOMOPTERA
Greenbug                                Schizaphis graminum
HEMIPTERA
Lygus Bug                               Lygus lineolaris
NEMATODA
                                        Heterodera glycines

EXAMPLE 17 Insertion of Toxin Genes Into Plants

[0267] One aspect of the subject invention is the transformation ofplants with genes encoding the insecticidal toxin of the presentinvention. The transformed plants are resistant to attack by the targetpest.

[0268] Genes encoding pesticidal toxins, as disclosed herein, can beinserted into plant cells using a variety of techniques which are wellknown in the art. For example, a large number of cloning vectorscomprising a replication system in E. coli and a marker that permitsselection of the transformed cells are available for preparation for theinsertion of foreign genes into higher plants. The vectors comprise, forexample, pBR322, pUC series, M13mp series, pACYC184, etc. Accordingly,the sequence encoding the Bacillus toxin can be inserted into the vectorat a suitable restriction site. The resulting plasmid is used fortransformation into E. coli. The E. coli cells are cultivated in asuitable nutrient medium, then harvested and lysed. The plasmid isrecovered. Sequence analysis, restriction analysis, electrophoresis, andother biochemical-molecular biological methods are generally carried outas methods of analysis. After each manipulation, the DNA sequence usedcan be cleaved and joined to the next DNA sequence. Each plasmidsequence can be cloned in the same or other plasmids. Depending on themethod of inserting desired genes into the plant, other DNA sequencesmay be necessary. If, for example, the Ti or Ri plasmid is used for thetransformation of the plant cell, then at least the right border, butoften the right and the left border of the Ti or Ri plasmid T-DNA, hasto be joined as the flanking region of the genes to be inserted.

[0269] The use of T-DNA for the transformation of plant cells has been intensively researched and sufficiently described in EP 120 516; Hoekema (1985) In: The Binary Plant Vector System, Offset-drukkerij Kanters B.V., Alblasserdam, Chapter 5; Fraley et al., Crit. Rev. Plant Sci. 4:1-46; and An et al. (1985) EMBO J. 4:277-287.

[0270] Once the inserted DNA has been integrated in the genome, it is relatively stable there and, as a rule, does not come out again. It normally contains a selection marker that confers on the transformed plant cells resistance to a biocide or an antibiotic, such as kanamycin, G 418, bleomycin, hygromycin, or chloramphenicol, inter alia. The individually employed marker should accordingly permit the selection of transformed cells rather than cells that do not contain the inserted DNA.

[0271] A large number of techniques are available for inserting DNA into a plant host cell. Those techniques include transformation with T-DNA using Agrobacterium tumefaciens or Agrobacterium rhizogenes as transformation agent, fusion, injection, biolistics (microparticle bombardment), or electroporation as well as other possible methods. If Agrobacteria are used for the transformation, the DNA to be inserted has to be cloned into special plasmids, namely either into an intermediate vector or into a binary vector. The intermediate vectors can be integrated into the Ti or Ri plasmid by homologous recombination owing to sequences that are homologous to sequences in the T-DNA. The Ti or Ri plasmid also comprises the vir region necessary for the transfer of the T-DNA. Intermediate vectors cannot replicate themselves in Agrobacteria. The intermediate vector can be transferred into Agrobacterium tumefaciens by means of a helper plasmid (conjugation). Binary vectors can replicate themselves both in E. coli and in Agrobacteria. They comprise a selection marker gene and a linker or polylinker which are framed by the right and left T-DNA border regions. They can be transformed directly into Agrobacteria (Holsters et al. [1978] Mol. Gen. Genet. 163:181-187). The Agrobacterium used as host cell is to comprise a plasmid carrying a vir region. The vir region is necessary for the transfer of the T-DNA into the plant cell. Additional T-DNA may be contained. The bacterium so transformed is used for the transformation of plant cells. Plant explants can advantageously be cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the transfer of the DNA into the plant cell. Whole plants can then be regenerated from the infected plant material (for example, pieces of leaf, segments of stalk, roots, but also protoplasts or suspension-cultivated cells) in a suitable medium, which may contain antibiotics or biocides for selection. The plants so obtained can then be tested for the presence of the inserted DNA. No special demands are made of the plasmids in the case of injection and electroporation. It is possible to use ordinary plasmids, such as, for example, pUC derivatives. In biolistic transformation, plasmid DNA or linear DNA can be employed.

[0272] The transformed cells are regenerated into morphologically normal plants in the usual manner. If a transformation event involves a germ line cell, then the inserted DNA and corresponding phenotypic trait(s) will be transmitted to progeny plants. Such plants can be grown in the normal manner and crossed with plants that have the same transformed hereditary factors or other hereditary factors. The resulting hybrid individuals have the corresponding phenotypic properties.

[0273] In a preferred embodiment of the subject invention, plants will be transformed with genes wherein the codon usage has been optimized for plants. See, for example, U.S. Pat. No. 5,380,831. Also, advantageously, plants encoding a truncated toxin will be used. The truncated toxin typically will comprise about 55% to about 80% of the full-length toxin. Methods for creating synthetic Bacillus genes for use in plants are known in the art.
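As a worked illustration of the "about 55% to about 80%" guideline, the sketch below converts a full-length toxin size into the corresponding range of truncated lengths, in residues and in coding bases. It is an assumption-laden example rather than part of the disclosed method: the function names truncation_length_range and coding_length_bp are hypothetical, the percentages are treated as hard bounds even though the text says "about", and the sketch does not say which end of the toxin is removed, since the text does not specify this. The 789-residue input corresponds to the length stated in the sequence listing for the 81F toxin.

```python
def truncation_length_range(full_length_aa, low_pct=55, high_pct=80):
    """Return (shortest, longest) truncated length in amino acid residues.

    Integer arithmetic is used so the bounds are exact: the lower bound is
    rounded up and the upper bound is rounded down.
    """
    if full_length_aa <= 0:
        raise ValueError("full_length_aa must be positive")
    if not 0 < low_pct <= high_pct <= 100:
        raise ValueError("percentages must satisfy 0 < low <= high <= 100")
    shortest = -(-full_length_aa * low_pct // 100)  # ceiling division
    longest = full_length_aa * high_pct // 100      # floor division
    return shortest, longest


def coding_length_bp(n_residues, include_stop=True):
    """Return the length in bases of a coding sequence for n_residues."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


if __name__ == "__main__":
    shortest, longest = truncation_length_range(789)
    print("residues:", shortest, "to", longest)
    print("bases:", coding_length_bp(shortest), "to", coding_length_bp(longest))
```

For a 789-residue toxin this gives a truncated protein of 434 to 631 residues, that is, a coding sequence of 1305 to 1896 bases when a stop codon is included.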

[0274] It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application.

1 144 29 base pairs nucleic acid single linear DNA (genomic) 1GARCCRTGGA AAGCAAATAA TAARAATGC 29 33 base pairs nucleic acid singlelinear DNA (genomic) 2 AAARTTATCT CCCCAWGCTT CATCTCCATT TTG 33 2375 basepairs nucleic acid single linear DNA (genomic) 36a 3 ATGAACAAGAATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 AATGGCATTTATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 20 GATACAGGTGGTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 80 ATTTCTGGTAAATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 TTAAATACAGAATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 AATGATGTTAATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360 ATTACCTCTATGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 TACTTAAGTAAACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480 CTTATTAACTCTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540 GAAAAATTTGAGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 TCTCCTGCAAATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 AAAAATGATGTGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 AATAATTTATTCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 GTGAAAACAAGTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840 CTGCAAGCAAAAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 ATTGATTATACTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 AACATCCTCCCTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 AGTGATGAAGATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 GAAATTAGTAATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 TATCAAGTCGATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 TGCCCAGATCAATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 GTAATTACTAAAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 AATTTTTATGATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 GAAGCGGAGTATAAAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440 ATCAGTGAAACATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 
AGATTAATTACTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 AGCAATAAAGAAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620 AACGGGTCCATAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 GTAGATCATACAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740 ATTTCACAATTTATTGGAGA TAATTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 GTTAAAGGAAAACCTTCTAT TCATTTAATA GATGAAAATA CTGGATATAT TCATTATGAA 1860 GATACAAATAATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920 GATTTAAAGGGAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 AACTTTATTATTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 ACAAATAATTGGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 CAGGGAGGACGAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 GTGTATTTTTCTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 TTTGAAAAAAGATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 GAGAAAGATAACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340 GTACATTTTTACGATGTCTC TATTAAGTAA CCCAA 2375 790 amino acids amino acid singlelinear protein 36a 4 Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala LeuPro Ser Phe 1 5 10 15 Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala ThrGly Ile Lys Asp 20 25 30 Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly GlyAsp Leu Thr Leu 35 40 45 Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn AspIle Ser Gly Lys 50 55 60 Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu IleAla Gln Gly Asn 65 70 75 80 Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu LysIle Ala Asn Glu Gln 85 90 95 Asn Gln Val Leu Asn Asp Val Asn Asn Lys LeuAsp Ala Ile Asn Thr 100 105 110 Met Leu Arg Val Tyr Leu Pro Lys Ile ThrSer Met Leu Ser Asp Val 115 120 125 Met Lys Gln Asn Tyr Ala Leu Ser LeuGln Ile Glu Tyr Leu Ser Lys 130 135 140 Gln Leu Gln Glu Ile Ser Asp LysLeu Asp Ile Ile Asn Val Asn Val 145 150 155 160 Leu Ile Asn Ser Thr LeuThr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175 Lys Tyr Val Asn GluLys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190 Ser Ser Lys ValLys Lys Asp Gly 
Ser Pro Ala Asn Ile Leu Asp Glu 195 200 205 Leu Thr GluLeu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220 Asp GlyPhe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250255 Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260265 270 Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr275 280 285 Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp TyrThr 290 295 300 Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu PheArg Val 305 310 315 320 Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser AsnPro Asn Tyr Ala 325 330 335 Lys Val Lys Gly Ser Asp Glu Asp Ala Lys MetIle Val Glu Ala Lys 340 345 350 Pro Gly His Ala Leu Ile Gly Phe Glu IleSer Asn Asp Ser Ile Thr 355 360 365 Val Leu Lys Val Tyr Glu Ala Lys LeuLys Gln Asn Tyr Gln Val Asp 370 375 380 Lys Asp Ser Leu Ser Glu Val IleTyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400 Cys Pro Asp Gln Ser GluGln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415 Pro Asn Glu Tyr ValIle Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430 Thr Leu Arg TyrGlu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445 Glu Ile AspLeu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460 Lys ThrLeu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490495 Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500505 510 Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile515 520 525 Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly SerIle 530 535 540 Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys AsnAla Tyr 545 550 555 560 Val Asp His Thr Gly Gly Val Asn Gly Thr Lys AlaLeu Tyr Val His 565 570 575 Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly AspAsn Leu Lys Pro Lys 580 585 590 Thr Glu Tyr Val Ile Gln Tyr Thr Val LysGly Lys Pro Ser Ile His 595 600 605 Leu Ile Asp Glu Asn Thr Gly Tyr IleHis Tyr Glu Asp Thr Asn Asn 
610 615 620 Asn Leu Glu Asp Tyr Gln Thr IleAsn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640 Asp Leu Lys Gly Val TyrLeu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655 Ala Trp Gly Asp AsnPhe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys 660 665 670 Leu Leu Ser ProGlu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685 Ser Thr AsnIle Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700 Gly IleLeu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730735 Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740745 750 Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu755 760 765 Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His PheTyr 770 775 780 Asp Val Ser Ile Lys Pro 785 790 2370 base pairs nucleicacid single linear DNA (genomic) 81F 5 ATGAACAAGA ATAATACTAA ATTAAGCACAAGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 AATGGCATTT ATGGATTTGC CACTGGTATCAAAGACATTA TGAACATGAT TTTTAAAACG 120 GATACAGGTG GTGATCTAAC CCTAGACGAAATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 ATTTCTGGTA AATTGGATGG GGTGAATGGAAGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 TTAAATACAG AATTATCTAA AGAAATATTAAAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 AATGATGTTG ATAACAAACT CGATGCGATAAATACGATGC TTCGGGTATA TCTACCTAAA 360 ATTACCTCTA TGTTGAGTGA TGTAATGAAACAAAATTATG CGCTAAGTCT GCAAATAGAA 420 TACTTAAGTA AACAATTGCA AGAGATTTCTGATAAGTTGG ATATTATTAA TGTAAATGTA 480 CTTATTAACT CTACACTTAC TGAAATTACACCTGCGTATC AAAGGATTAA ATATGTGAAC 540 GAAAAATTTG AGGAATTAAC TTTTGCTACAGAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 TCTCCTGCAG ATATTCTTGA TGAGTTAACTGAGTTAACTG AACTAGCGAA AAGTGTAACA 660 AAAAATGATG TGGATGGTTT TGAATTTTACCTTAATACAT TCCACGATGT AATGGTAGGA 720 AATAATTTAT TCGGGCGTTC AGCTTTAAAAACTGCATCGG AATTAATTAC TAAAGAAAAT 780 GTGAAAACAA GTGGCAGTGA GGTCGGAAATGTTTATAACT TCTTAATTGT ATTAACAGCT 840 CTGCAAGCAA AAGCTTTTCT TACTTTAACAACATGCCGAA AATTATTAGG CTTAGCAGAT 900 ATTGATTATA CTTCTATTAT GAATGAACATTTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 AACATCCTCC CTACACTTTC 
TAATACTTTTTCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 AGTGATGAAG ATGCAAAGAT GATTGTGGAAGCTAAACCAG GACATGCATT GGTTGGGTTT 1080 GAAATTAGTA ATGATTCAAT TACAGTATTAAAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 TATCAAGTTG ATAAGGATTC CTTATCGGAAGTTATTTATG GTGATATGGA TAAATTATTG 1200 TGCCCAGATC AATCTGAACA AATCTATTATACAAATAACA TAGTATTTCC AAATGAATAT 1260 GTAATTACTA AAATTGATTT TACTAAAAAAATGAAAACTT TAAGATATGA GGTAACAGCG 1320 AATTTTTATG ATTCTTCTAC AGGAGAAATTGACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 GAAGCGGAGT ATAGAACGTT AAGTGCTAATGATGATGGAG TGTATATGCC GTTAGGTGTC 1440 ATCAGTGAAA CATTTTTGAC TCCGATTAATGGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 AGATTAATTA CTTTAACATG TAAATCATATTTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 AGCAATAAAG AAACTAAATT GATCGTCCCGCCCAGTGGTT TTATTAAAAA TATTGTAGAG 1620 AACGGGTCCA TAGAAGAGGA CAATTTAGAGCCGTGGAAAG CAAATAATAA GAATGAGTAT 1680 GTAGATCATA CAGGCGGAGT GAATGGRACTAAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740 ATTTCACAAT TTATTGGAGA TAAGTTAAAACCGAAAACTG AGTATGTAAT CCAATATACT 1800 GTTAAAGGAA AACCTTCTAT TCATTTAAAAGATGAAAATA CTGGATATAT TCATTATGAA 1860 GATACAAATA ATAATTTAGA AGATTATCAAACTATTACTA AACGTTTTAC TACAGGAACT 1920 GATTTAAAGG GAGTGTATTT AATTTTAAAAAGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 AACTTTATTA TTTTGGAAAT TAGTCCTTCTGAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 ACAAATAATT GGACGAGTAC GGGATCAACTAATATTAGCG GTAATACACT CACTCTTTAT 2100 CAGGGAGGAC GAGGAATTCT AAAACAAAACCTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 GTGTATTTTT CTGTGTCCGG AGATGCTAATGTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 TTTGAAAAAA GATATATGAG CGGTGCTAAAGATGTTTCTG AAATTTTCAC TACAAAATTT 2280 GGGAAAGATA ACTTTTATAT AGAGCTTTCTCAAGGGAATA ATTTAAATGG TGGCCCTATT 2340 GTACAGTTTC CCGATGTCTC TATTAAGTAA2370 789 amino acids amino acid single linear protein 81F 6 Met Asn LysAsn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15 Ile AspTyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30 Ile MetAsn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45 Asp GluIle Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60 Leu AspGly Val Asn Gly Ser Leu 
Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80 LeuAsn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95 AsnGln Val Leu Asn Asp Val Asp Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120125 Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130135 140 Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val145 150 155 160 Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr GlnArg Ile 165 170 175 Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe AlaThr Glu Thr 180 185 190 Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala AspIle Leu Asp Glu 195 200 205 Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser ValThr Lys Asn Asp Val 210 215 220 Asp Gly Phe Glu Phe Tyr Leu Asn Thr PheHis Asp Val Met Val Gly 225 230 235 240 Asn Asn Leu Phe Gly Arg Ser AlaLeu Lys Thr Ala Ser Glu Leu Ile 245 250 255 Thr Lys Glu Asn Val Lys ThrSer Gly Ser Glu Val Gly Asn Val Tyr 260 265 270 Asn Phe Leu Ile Val LeuThr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285 Leu Thr Thr Cys ArgLys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300 Ser Ile Met AsnGlu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320 Asn IleLeu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335 LysVal Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350Pro Gly His Ala Leu Val Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360365 Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370375 380 Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu385 390 395 400 Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn IleVal Phe 405 410 415 Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr LysLys Met Lys 420 425 430 Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr AspSer Ser Thr Gly 435 440 445 Glu Ile Asp Leu Asn Lys Lys Lys Val Glu SerSer Glu Ala Glu Tyr 450 455 460 Arg Thr Leu Ser Ala Asn Asp Asp Gly ValTyr Met Pro Leu Gly Val 465 470 475 480 Ile Ser Glu Thr Phe Leu Thr ProIle Asn Gly Phe Gly Leu Gln Ala 
485 490 495 Asp Glu Asn Ser Arg Leu IleThr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510 Glu Leu Leu Leu Ala ThrAsp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525 Val Pro Pro Ser GlyPhe Ile Lys Asn Ile Val Glu Asn Gly Ser Ile 530 535 540 Glu Glu Asp AsnLeu Glu Pro Trp Lys Ala Asn Asn Lys Asn Glu Tyr 545 550 555 560 Val AspHis Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575 LysAsp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600605 Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610615 620 Asn Leu Glu Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr625 630 635 640 Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn GlyAsp Glu 645 650 655 Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser ProSer Glu Lys 660 665 670 Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn TrpThr Ser Thr Gly 675 680 685 Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr LeuTyr Gln Gly Gly Arg 690 695 700 Gly Ile Leu Lys Gln Asn Leu Gln Leu AspSer Phe Ser Thr Tyr Arg 705 710 715 720 Val Tyr Phe Ser Val Ser Gly AspAla Asn Val Arg Ile Arg Asn Ser 725 730 735 Arg Glu Val Leu Phe Glu LysArg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750 Ser Glu Ile Phe Thr ThrLys Phe Gly Lys Asp Asn Phe Tyr Ile Glu 755 760 765 Leu Ser Gln Gly AsnAsn Leu Asn Gly Gly Pro Ile Val Gln Phe Pro 770 775 780 Asp Val Ser IleLys 785 2375 base pairs nucleic acid single linear DNA (genomic) Jav90 7ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA 
TGTAAATGTA 480CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560AGCAATAAAG AAACTAAATT GATYGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160GTGTATTTTT CTGTGTCCGG AGATGCTAAT 
GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340GTACATTTTT ACGATGTCTC TATTAAGTAA CCCAA 2375 790 amino acids amino acidsingle linear protein Jav90 8 Met Asn Lys Asn Asn Thr Lys Leu Ser ThrArg Ala Leu Pro Ser Phe 1 5 10 15 Ile Asp Tyr Phe Asn Gly Ile Tyr GlyPhe Ala Thr Gly Ile Lys Asp 20 25 30 Ile Met Asn Met Ile Phe Lys Thr AspThr Gly Gly Asp Leu Thr Leu 35 40 45 Asp Glu Ile Leu Lys Asn Gln Gln LeuLeu Asn Asp Ile Ser Gly Lys 50 55 60 Leu Asp Gly Val Asn Gly Ser Leu AsnAsp Leu Ile Ala Gln Gly Asn 65 70 75 80 Leu Asn Thr Glu Leu Ser Lys GluIle Leu Lys Ile Ala Asn Glu Gln 85 90 95 Asn Gln Val Leu Asn Asp Val AsnAsn Lys Leu Asp Ala Ile Asn Thr 100 105 110 Met Leu Arg Val Tyr Leu ProLys Ile Thr Ser Met Leu Ser Asp Val 115 120 125 Met Lys Gln Asn Tyr AlaLeu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140 Gln Leu Gln Glu IleSer Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160 Leu Ile AsnSer Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175 Lys TyrVal Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190 SerSer Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215220 Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225230 235 240 Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu LeuIle 245 250 255 Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly AsnVal Tyr 260 265 270 Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys AlaPhe Leu Thr 275 280 285 Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala AspIle Asp Tyr Thr 290 295 300 Ser Ile Met Asn Glu His Leu Asn Lys Glu LysGlu Glu Phe Arg Val 305 310 315 320 Asn Ile Leu Pro Thr Leu Ser Asn ThrPhe Ser Asn Pro Asn Tyr Ala 325 330 335 Lys Val Lys Gly Ser Asp Glu AspAla Lys Met Ile Val Glu Ala Lys 340 345 350 Pro Gly His Ala Leu Ile GlyPhe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365 
Val Leu Lys Val Tyr GluAla Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380 Lys Asp Ser Leu SerGlu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400 Cys Pro AspGln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415 Pro AsnGlu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430 ThrLeu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455460 Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465470 475 480 Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu GlnAla 485 490 495 Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser TyrLeu Arg 500 505 510 Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu ThrLys Leu Ile 515 520 525 Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val GluAsn Gly Ser Ile 530 535 540 Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala AsnAsn Lys Asn Ala Tyr 545 550 555 560 Val Asp His Thr Gly Gly Val Asn GlyThr Lys Ala Leu Tyr Val His 565 570 575 Lys Asp Gly Gly Ile Ser Gln PheIle Gly Asp Lys Leu Lys Pro Lys 580 585 590 Thr Glu Tyr Val Ile Gln TyrThr Val Lys Gly Lys Pro Ser Ile His 595 600 605 Leu Lys Asp Glu Asn ThrGly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620 Asn Leu Glu Asp TyrGln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640 Asp Leu LysGly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655 Ala TrpGly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys 660 665 670 LeuLeu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695700 Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705710 715 720 Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg AsnSer 725 730 735 Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala LysAsp Val 740 745 750 Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn PheTyr Ile Glu 755 760 765 Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro IleVal His Phe Tyr 770 775 780 Asp Val Ser Ile Lys Pro 785 790 
47 basepairs nucleic acid single linear DNA (genomic) 9 GCTCTAGAAG GAGGTAACTTATGAACAAGA ATAATACTAA ATTAAGC 47 2035 base pairs nucleic acid singlelinear DNA (genomic) 158C2 10 ATGAACAAGA ATAATACTAA ATTAAGCGCAAGGGCCTACC GAGTTTTATT GATTATTTTA 60 ATGGCATTTA TGGATTTGCC ACTGGTATCAAAGACATTAT GAATATGATT TTTAAAACGG 120 ATACAGGTGG TAATCTAACC TTAGACGAAATCCTAAAGAA TCAGCAGTTA CTAAATGAGA 180 TTTCTGGTAA ATTGGATGGG GTAAATGGGAGCTTAAATGA TCTTATCGCA CAGGGAAACT 240 TAAATACAGA ATTAGCTAAG CAAATCTTAAAAGTTGCAAA TGAACAAAAT CAAGTTTTAA 300 ATGATGTTAA TAACAAACTA GACTGCGATAAATACGATGC TTAAAATATA TCTACCTAAA 360 ATTCACATCT ATGTTAAGTG ATGTACTGAAGCCAAAATTA TGTGCTTAAG TCTTGCAAAT 420 TGGAATTACC TTTAAGTAAC ATCTGCACCTTGGCAAGAAA TCTCCGACAA GCTAGATATT 480 ATTAACGTAA ATGTGCTTAT TAACTCTACGCTTACTGAAA TTACACCTGC GTATCAACGA 540 ATTAAATATG TGAATGAAAA ATTTGACGATTTAACTTTTG CTACAGAAAA CACTTTAAAA 600 GTAAAAAAGG ATAGCTCTCC TGCTGATATTCTTGACGAGT TAACTGAATT AACTGAACTA 660 GCGAAAAGTG TTACAAAAAA TGACGTGGATGGTTTTGAAT TTTACCTTAA TACATTCCAT 720 GATGTAATGG TGGGAAATAA TTTATTCGGTCGTTCAGCTT TAAAAACTGC TTCGGAATTA 780 ATTGCTAAAG AAAATGTGAA AACAAGTGGCAGTGAAGTAG GAAATGTTTA TAATTTCTTA 840 ATTGTATTAA CAGCTCTACA AGCAAAAGCTTTTCTTACTT TAACAACATG CCGAAAATTA 900 TTAGGCTTAG CAGATATTGA TTATACTTCTATCATGAATG AGCATTTAAA TAAGGAAAAA 960 GAGGAATTTA GAGTAAACAT CCTTCCCACACTTTCTAATA CCTTTTCTAA TCCTAATTAT 1020 GCAAAAGCTA AGGGAAGTAA TGAAGATACAAAGATGATTG TGGAAGCTAA ACCAGGATAT 1080 GTTTTGGTTG GATTTGAAAT GAGCAATAATTCAATTACAG TATTAAAAGC ATATCAAGCT 1140 AAGCTAAAAA AAGATTATCA AATTGATAAGGATTCGTTAT CAGAAATAAT ATATAGTACG 1200 TGATACGGAT AAATTATTAT GTCCGGATCAATCTGAACAA TATATTATAC AAAGAACATA 1260 GCATTTCCAA ATGAATATGT TATTACTAAAATTGCTTTTA CTAAAAAAAT GAACAGTTTA 1320 AGGTATGAGG CGACAGCGAA TTTTTATGATTCTTCTACAG GGGATATTGA TCTAAATAAG 1380 ACAAAAGTAG AATCAAGTGA AGCGGAGTATAGTATGCTAA AAGCTAGTGA TGATGAAGTT 1440 TACATGCCGC TAGGTCTTAT CAGTGAAACATTTTTAAATC CAATTAATGG ATTTAGGCTT 1500 GCAGTCGATG AAAATTCCAG ACTAGTAACTTTAACATGTA GATCATATTT AAGAGAGACA 1560 TTGTTAGCGA CAGATTTAAA 
TAATAAAGAAACTAAATTGA TTGTCCCACC TAATGTTTTT 1620 ATTAGCAATA TTGTAGAGAA TGGAAATATAGAAATGGACA CCTTAGAACC ATGGAAGGCA 1680 AATAATGAGA ATGCGAATGT AGATTATTCAGGCGGAGTGA ATGGAACTAG AGCTTTATAT 1740 GTTCATAAGG ATGGTGAATT CTCACATTTTATTGGAGACA AGTTGAAATC TAAAACAGAA 1800 TACTTGATTC GATATATTGT AAAAGGAAAAGCTTCTATTT TTTTAAAAGA TGAAAGAAAT 1860 GAAAATTACA TTTACGAGGA TACAAATAATAATTTAGAAG ATTATCAAAC TATTACTAAA 1920 CGTTTTACTA CAGGAACTGA TTCGACAGGATTTTATTTAT TTTTTACTAC TCAAGATGGA 1980 AATGAAGCTT GGGGAGACAC TTTTTTTCTCTAGAAAGAGG TAACTTATGA ACAAG 2035 21 base pairs nucleic acid singlelinear DNA (genomic) 11 CATCCTCCCT ACACTTTCTA A 21 950 base pairsnucleic acid single linear DNA (genomic) 49C-ptl 12 AAACTAGAGGGAGTGATAAG GATGCGAAAA TCATTATGGA AGCTAAACCT GGATATGCTT 60 TAGTTGGATTTGAAATAAGT AAGGATTCAA TTGCAGTATT AAAAGTTTAT CAGGCAAAGC 120 TAAAACACAACTATCAAATT GATAAGGATT CGTTATCAGA AATTGTTTAT GGTGATATAG 180 ATAAATTATTATGTCCGGAT CAATCTGAAC AAATGTATTA TACAAATAAA ATAGCATTTC 240 CAAATGAATATGTTATCACT AAAATTGCTT TTACTAAAAA ACTGAACAGT TTAAGATATG 300 AGGTCACAGCGAATTTTTAT GACTCTTCTA CAGGAGATAT TGATCTAAAT AAGAAAAAAA 360 TAGAATCAAGTGAAGCGGAG TTTAGTATGC TAAATGCTAA TAATGATGGT GTTTATATGC 420 CGATAGGTACTATAAGTGAA ACATTTTTGA CTCCAATTAA TGGATTTGGC CTCGTAGTCG 480 ATGAAAATTCAAGACTAGTA ACTTTGACAT GTAAATCATA TTTAAGAGAG ACATTGTTAG 540 CAACAGACTTAAGTAATAAA GAAACTAAAC TGATTGTCCC ACCTAATGGT TTTATTAGCA 600 ATATTGTAGAAAATGGGAAC TTAGAGGGAG AAAACTTAGA GCCGTGGGAA AGCAAATAAC 660 AAAAATGCGTATGTAGATCA TACCGGAGGT GTAAATGGAA CTAAAGTTTT ATATGTTCAT 720 GAGGATGGTGAGTTCTCACA ATTTATTGGG GATAAATTGA AATTGAAAAC AGAATATGTA 780 ATTCCATATATTGTAAAGGG GAAAGCTGCT ATTTATTTAA AAGATGAAAA AAATGGGGAT 840 TACATATCATGAAGAAACAT CATAATGCAA TTGAAGATTT TTCCAGCTGT AACTTCAATA 900 ATGATTTTCGCATCCTTATC ATCCCTCTAG CTTTTTCATA ATAGGATAGA 950 20 base pairs nucleicacid single linear DNA (genomic) 13 AAATTATGCG CTAAGTCTGC 20 19 basepairs nucleic acid single linear DNA (genomic) 14 TTGATCCGGA CATAATAAT19 176 base pairs nucleic acid single linear DNA (genomic) 
49C-pt2 15GTAAATTATG CGCTAAGTCT GCACCTTTTT TCACTGTTAC TAAACATCAC TTTTCCTATA 60TCCCCTTAGC TCTTATGGAT TATTGAGCAA ACTTATCTTG TTAATTACTA CTCCCCATCA 120TATGCTAAAC AAAAACCAAA CAAACATTAT CTATTATATG TCCGGATCAA AATGTA 176 20base pairs nucleic acid single linear DNA (genomic) 16 GGRTTAMTTGGRTAYTATTT 20 20 base pairs nucleic acid single linear DNA (genomic) 17ATATCKWAYA TTKGCATTTA 20 1076 base pairs nucleic acid single linear DNA(genomic) 10E1 18 TGGGATTACT TGGATATTAT TTCCAGGATC AAAAGTTTCA GCAACTTGCTTTGATGGCAC 60 ATAGACAAGC TTCTGATTTG GAAATCCCGA AAGATGACGT GAAACAGTTACTATCCAAGG 120 AGCAGCAACA CATTCAATCT GTTAGATGGC TTGGCTATAT TCAGCCACCTCAAACAGGAG 180 ACTATGTATT GTCAACCTCA TCCGACCAAC AGGTCGTGAT TGAACTCGATGGAAAAACCA 240 TTGTCAATCA AACTTCTATG ACAGAACCGA TTCAACTCGA AAAAGATAAGCTCTATAAAA 300 TTAGAATTGA ATATGTCCCA GAAGATACAA AAGAACAAGA GAACCTCCTTGACTTTCAGC 360 TCAACTGGTC GATTTCAGGA TCAGAGATAG AACCAATTCC GGAGAATGCTTTCCATTTAC 420 CAAATTTTTC TCGTAAACAA GATCAAGAGA AAATCATCCC TGAAACCAGTTTGTTTCAGG 480 AACAAGGAGA TGAGAAAAAA GTATCTCGCA GTAAGAGATC TTTAGCTACAAATCCTATCC 540 GTGATACAGA TGATGATAGT ATTTATGATG AATGGGAAAC GGAAGGATACACGATACGGG 600 AACAAATAGC AGTGAAATGG GACGATTCTA TGAAGGATAG AGGTTATACCAAATATGTGT 660 CAAACCCCTA TAAGTCTCAT ACAGTAGGAG ATCCATACAC AGATTGGGAAAAAGCGGCTG 720 GCCGTATCGA TAACGGTGTC AAAGCAGAAG CCAGAAATCC TTTAGTCGCGGCCTATCCAA 780 CTGTTGGTGT ACATATGGAA AGATTAATTG TCTCCGAAAA ACAAAATATATCAACAGGGC 840 TTGGAAAAAC TGTATCTGCG TCTATGTCCG CAAGCAATAC CGCAGCGATTACGGCAGGTA 900 TTGATGCAAC AGCCGGTGCC TCTTTACTCG GGCCATCTGG AAGTGTCACGGCTCATTTTT 960 CTTATACAGG ATCTAGTACA TCCACCGTTG AAGATAGCTC CAGCCGGAATTGGAGTCAAG 1020 ACCTTGGGAT CGATACGGGA CAATCTGCAT ATTTAAATGC CAAATGTACGATATAA 1076 357 amino acids amino acid single linear peptide 10E1 19 GlyLeu Leu Gly Tyr Tyr Phe Gln Asp Gln Lys Phe Gln Gln Leu Ala 1 5 10 15Leu Met Ala His Arg Gln Ala Ser Asp Leu Glu Ile Pro Lys Asp Asp 20 25 30Val Lys Gln Leu Leu Ser Lys Glu Gln Gln His Ile Gln Ser Val Arg 35 40 45Trp Leu Gly Tyr Ile Gln Pro Pro Gln Thr Gly Asp 
Tyr Val Leu Ser 50 55 60Thr Ser Ser Asp Gln Gln Val Val Ile Glu Leu Asp Gly Lys Thr Ile 65 70 7580 Val Asn Gln Thr Ser Met Thr Glu Pro Ile Gln Leu Glu Lys Asp Lys 85 9095 Leu Tyr Lys Ile Arg Ile Glu Tyr Val Pro Glu Asp Thr Lys Glu Gln 100105 110 Glu Asn Leu Leu Asp Phe Gln Leu Asn Trp Ser Ile Ser Gly Ser Glu115 120 125 Ile Glu Pro Ile Pro Glu Asn Ala Phe His Leu Pro Asn Phe SerArg 130 135 140 Lys Gln Asp Gln Glu Lys Ile Ile Pro Glu Thr Ser Leu PheGln Glu 145 150 155 160 Gln Gly Asp Glu Lys Lys Val Ser Arg Ser Lys ArgSer Leu Ala Thr 165 170 175 Asn Pro Ile Arg Asp Thr Asp Asp Asp Ser IleTyr Asp Glu Trp Glu 180 185 190 Thr Glu Gly Tyr Thr Ile Arg Glu Gln IleAla Val Lys Trp Asp Asp 195 200 205 Ser Met Lys Asp Arg Gly Tyr Thr LysTyr Val Ser Asn Pro Tyr Lys 210 215 220 Ser His Thr Val Gly Asp Pro TyrThr Asp Trp Glu Lys Ala Ala Gly 225 230 235 240 Arg Ile Asp Asn Gly ValLys Ala Glu Ala Arg Asn Pro Leu Val Ala 245 250 255 Ala Tyr Pro Thr ValGly Val His Met Glu Arg Leu Ile Val Ser Glu 260 265 270 Lys Gln Asn IleSer Thr Gly Leu Gly Lys Thr Val Ser Ala Ser Met 275 280 285 Ser Ala SerAsn Thr Ala Ala Ile Thr Ala Gly Ile Asp Ala Thr Ala 290 295 300 Gly AlaSer Leu Leu Gly Pro Ser Gly Ser Val Thr Ala His Phe Ser 305 310 315 320Tyr Thr Gly Ser Ser Thr Ser Thr Val Glu Asp Ser Ser Ser Arg Asn 325 330335 Trp Ser Gln Asp Leu Gly Ile Asp Thr Gly Gln Ser Ala Tyr Leu Asn 340345 350 Ala Lys Cys Thr Ile 355 1045 base pairs nucleic acid singlelinear DNA (genomic) 31J2 20 TGGGTTACTT GGGTATTATT TTAAAGGAAA AGATTTTAATAATCTTACTA TATTTGCTCC 60 AACACGTGAG AATACTCTTA TTTATGATTT AGAAACAGCGAATTCTTTAT TAGATAAGCA 120 ACAACAAACC TATCAATCTA TTCGTTGGAT CGGTTTAATAAAAAGCAAAA AAGCTGGAGA 180 TTTTACCTTT CAATTATCGG ATGATGAGCA TGCTATTATAGAAATCGATG GGAAAGTTAT 240 TTCGCAAAAA GGCCAAAAGA AACAAGTTGT TCATTTAGAAAAAGATAAAT TAGTTCCCAT 300 CAAAATTGAA TATCAATCTG ATAAAGCGTT AAACCCAGATAGTCAAATGT TTAAAGAATT 360 GAAATTATTT AAAATAAATA GTCAAAAACA ATCTCAGCAAGTGCAACAAG ACGAATTGAG 420 AAATCCTGAA TTTGGTAAAG AAAAAACTCA 
AACATATTTAAAGAAAGCAT CGAAAAGCAG 480 CTTGTTTAGC AATAAAAGTA AACGAGATAT AGATGAAGATATAGATGAGG ATACAGATAC 540 AGATGGAGAT GCCATTCCTG ATGTATGGGA AGAAAATGGGTATACCATCA AAGGAAGAGT 600 AGCTGTTAAA TGGGACGAAG GATTAGCTGA TAAGGGATATAAAAAGTTTG TTTCCAATCC 660 TTTTAGACAG CACACTGCTG GTGACCCCTA TAGTGACTATGAAAAGGCAT CAAAAGATTT 720 GGATTTATCT AATGCAAAAG AAACATTTAA TCCATTGGTGGCTGCTTTTC CAAGTGTCAA 780 TGTTAGCTTG GAAAATGTCA CCATATCAAA AGATGAAAATAAAACTGCTG AAATTGCGTC 840 TACTTCATCG AATAATTGGT CCTATACAAA TACAGAGGGGGCATCTATTG AAGCTGGAAT 900 TGGACCAGAA GGTTTGTTGT CTTTTGGAGT AAGTGCCAATTATCAACATT CTGAAACAGT 960 GGCCAAAGAG TGGGGTACAA CTAAGGGAGA CGCAACACAATATAATACAG CTTCAGCAGG 1020 ATATCTAAAT GCCAATGTAC GATAT 1045 348 aminoacids amino acid single linear peptide 31J2 21 Gly Leu Leu Gly Tyr TyrPhe Lys Gly Lys Asp Phe Asn Asn Leu Thr 1 5 10 15 Ile Phe Ala Pro ThrArg Glu Asn Thr Leu Ile Tyr Asp Leu Glu Thr 20 25 30 Ala Asn Ser Leu LeuAsp Lys Gln Gln Gln Thr Tyr Gln Ser Ile Arg 35 40 45 Trp Ile Gly Leu IleLys Ser Lys Lys Ala Gly Asp Phe Thr Phe Gln 50 55 60 Leu Ser Asp Asp GluHis Ala Ile Ile Glu Ile Asp Gly Lys Val Ile 65 70 75 80 Ser Gln Lys GlyGln Lys Lys Gln Val Val His Leu Glu Lys Asp Lys 85 90 95 Leu Val Pro IleLys Ile Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro 100 105 110 Asp Ser GlnMet Phe Lys Glu Leu Lys Leu Phe Lys Ile Asn Ser Gln 115 120 125 Lys GlnSer Gln Gln Val Gln Gln Asp Glu Leu Arg Asn Pro Glu Phe 130 135 140 GlyLys Glu Lys Thr Gln Thr Tyr Leu Lys Lys Ala Ser Lys Ser Ser 145 150 155160 Leu Phe Ser Asn Lys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Glu 165170 175 Asp Thr Asp Thr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu Asn180 185 190 Gly Tyr Thr Ile Lys Gly Arg Val Ala Val Lys Trp Asp Glu GlyLeu 195 200 205 Ala Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe ArgGln His 210 215 220 Thr Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala SerLys Asp Leu 225 230 235 240 Asp Leu Ser Asn Ala Lys Glu Thr Phe Asn ProLeu Val Ala Ala Phe 245 250 255 Pro Ser Val Asn Val Ser Leu Glu Asn ValThr Ile Ser Lys Asp 
Glu 260 265 270 Asn Lys Thr Ala Glu Ile Ala Ser ThrSer Ser Asn Asn Trp Ser Tyr 275 280 285 Thr Asn Thr Glu Gly Ala Ser IleGlu Ala Gly Ile Gly Pro Glu Gly 290 295 300 Leu Leu Ser Phe Gly Val SerAla Asn Tyr Gln His Ser Glu Thr Val 305 310 315 320 Ala Lys Glu Trp GlyThr Thr Lys Gly Asp Ala Thr Gln Tyr Asn Thr 325 330 335 Ala Ser Ala GlyTyr Leu Asn Ala Asn Val Arg Tyr 340 345 1641 base pairs nucleic acidsingle linear DNA (genomic) 33D2 22 CCAAAGGGGG NTTAAACCNG GANGGTTNNNTNNTTNNTTN TNGAANCCCA NTTGGAAACC 60 CNATNAAATT CNTGGTTANT GGTNGTGAGTGNNTNTTTTA NCNGAGNTTG CCCNTTTGNN 120 TACCNGGATT TNAAGGCAGA ANTTNTTNNTNGCTNNTTAA AGGTTNTGNT TNTNANTGAA 180 TTTTTTNGGN TTTGCCCAAA AAACAAGGATGAATCCTGTT ATTCCNCCCT NGAAAAAATN 240 GAAACGGAAC AACGTGAGTA TGATAAACATCTTTTACAAA CTGCGACATC TTGTTGAAAA 300 TGCCTTTTTT GAAAANNTAA AAGGTTTCGTGGCATTGCCA CACGTTATAC AAAAACCACG 360 TCTGCTTTTA GAGGGGCTGT TACCTTGGCTGCTATTTCTC TGTGGTTGAA TCTCGTATAG 420 ACACTATCTA GTCTATACAT CTTATCTTTTCATCATGATT CCAGTCGTAC ATTTACTCAA 480 AAATAGAAAG GATGACCCCT ATGCAATTAAAAAATGTATA CAAATGTTTA ACCATTACAG 540 CGCTTTTGGC TCAAATCGCC GCCTTCCCGTCTTCCTCTTT TGCGGAAGAC GGGAAGAAAA 600 AAGAAGAAAA TACAGCTAAA ACAGAACATCAACAGAAAAA AGAAACAAAA CCAGTTGTGG 660 GATTAATTGG TCACTATTTT ACTGATGATCAGTTTACTAA CACAGCATTT ATTCAAGTAG 720 GAGAAAAAAG TAAATTACTA GATTCAAAAATAGTAAAGCA AGATATGTCC AATTTGAAAT 780 CCATTCGATG GGAAGGAAAT GTGAAACCTCCTGAAACAGG AGAATATCTA CTTTCCACGT 840 CCTCTAATGA AAATGTTACA GTAAAAGTAGATGGAGAAAC TGTTATTAAC AAAGCTAACA 900 TGGAAAAAGC AATGAAACTC GAAAAAGATAAACCACACTC TATTGAAATT GAATATCATG 960 TTCCTGAGAA CGGGAAGGAA CTACAATTATTTTGGCAAAT AAATGACCAG AAAGCTGTTA 1020 AAATCCCAGA AAAAAACATA CTATCACCAAATCTTTCTGA ACAGATACAA CCGCAACAGC 1080 GTTCAACTCA ATCTCAACAA AATCAAAATGATAGGGATGG GGATAAAATC CCTGATAGTT 1140 TAGAAGAAAA TGGCTATACA TTTAAAGACGGTGCGATTGT TGCCTGGAAC GATTCCTATG 1200 CAGCACTAGG CTATAAAAAA TACATATCCAATTCTAATAA GGCTAAAACA GCTGCTGACC 1260 CCTATACGGA CTTTGAAAAA GTAACAGGACACATGCCGGA GGCAACTAAA GATGAAGTAA 1320 AAGATCCACT AGTAGCCGCT 
TATCCCTCGGTAGGTGTTGC TATGGAAAAA TTTCATTTTT 1380 CTAGAAATGA AACGGTCACT GAAGGAGACTCAGGTACTGT TTCAAAAACC GTAACCAATA 1440 CAAGCACAAC AACAAATAGC ATCGATGTTGGGGGATCCAT TGGATGGGGA GAAAAAGGAT 1500 TTTCTTTTTC ATTCTCTCCC AAATATACGCATTCTTGGAG TAATAGTACC GCTGTTGCTG 1560 ATACTGAAAG TAGCACATGG TCTTCACAATTAGCGTATAA TCCTTCAGAA CGTGCTTTCT 1620 TAAATGCCAA TATACGATAT A 1641 327amino acids amino acid single linear peptide 33D2 23 Gly Leu Ile Gly HisTyr Phe Thr Asp Asp Gln Phe Thr Asn Thr Ala 1 5 10 15 Phe Ile Gln ValGly Glu Lys Ser Lys Leu Leu Asp Ser Lys Ile Val 20 25 30 Lys Gln Asp MetSer Asn Leu Lys Ser Ile Arg Trp Glu Gly Asn Val 35 40 45 Lys Pro Pro GluThr Gly Glu Tyr Leu Leu Ser Thr Ser Ser Asn Glu 50 55 60 Asn Val Thr ValLys Val Asp Gly Glu Thr Val Ile Asn Lys Ala Asn 65 70 75 80 Met Glu LysAla Met Lys Leu Glu Lys Asp Lys Pro His Ser Ile Glu 85 90 95 Ile Glu TyrHis Val Pro Glu Asn Gly Lys Glu Leu Gln Leu Phe Trp 100 105 110 Gln IleAsn Asp Gln Lys Ala Val Lys Ile Pro Glu Lys Asn Ile Leu 115 120 125 SerPro Asn Leu Ser Glu Gln Ile Gln Pro Gln Gln Arg Ser Thr Gln 130 135 140Ser Gln Gln Asn Gln Asn Asp Arg Asp Gly Asp Lys Ile Pro Asp Ser 145 150155 160 Leu Glu Glu Asn Gly Tyr Thr Phe Lys Asp Gly Ala Ile Val Ala Trp165 170 175 Asn Asp Ser Tyr Ala Ala Leu Gly Tyr Lys Lys Tyr Ile Ser AsnSer 180 185 190 Asn Lys Ala Lys Thr Ala Ala Asp Pro Tyr Thr Asp Phe GluLys Val 195 200 205 Thr Gly His Met Pro Glu Ala Thr Lys Asp Glu Val LysAsp Pro Leu 210 215 220 Val Ala Ala Tyr Pro Ser Val Gly Val Ala Met GluLys Phe His Phe 225 230 235 240 Ser Arg Asn Glu Thr Val Thr Glu Gly AspSer Gly Thr Val Ser Lys 245 250 255 Thr Val Thr Asn Thr Ser Thr Thr ThrAsn Ser Ile Asp Val Gly Gly 260 265 270 Ser Ile Gly Trp Gly Glu Lys GlyPhe Ser Phe Ser Phe Ser Pro Lys 275 280 285 Tyr Thr His Ser Trp Ser AsnSer Thr Ala Val Ala Asp Thr Glu Ser 290 295 300 Ser Thr Trp Ser Ser GlnLeu Ala Tyr Asn Pro Ser Glu Arg Ala Phe 305 310 315 320 Leu Asn Ala AsnIle Arg Tyr 325 1042 base pairs nucleic acid single linear DNA 
(genomic)66D3 24 TTAATTGGGT ACTATTTTAA AGGAAAAGAT TTTAATAATC TTACTATATTTGCTCCAACA 60 CGTGAGAATA CTCTTATTTA TGATTTAGAA ACAGCGAATT CTTTATTAGATAAGCAACAA 120 CAAACCTATC AATCTATTCG TTGGATCGGT TTAATAAAAA GCAAAAAAGCTGGAGATTTT 180 ACCTTTCAAT TATCGGATGA TGAGCATGCT ATTATAGAAA TCGATGGGAAAGTTATTTCG 240 CAAAAAGGCC AAAAGAAACA AGTTGTTCAT TTAGAAAAAG ATAAATTAGTTCCCATCAAA 300 ATTGAATATC AATCTGATAA AGCGTTAAAC CCAGATAGTC AAATGTTTAAAGAATTGAAA 360 TTATTTAAAA TAAATAGTCA AAAACAATCT CAGCAAGTGC AACAAGACGAATTGAGAAAT 420 CCTGAATTTG GTAAAGAAAA AACTCAAACA TATTTAAAGA AAGCATCGAAAAGCAGCCTG 480 TTTAGCAATA AAAGTAAACG AGATATAGAT GAAGATATAG ATGAGGATACAGATACAGAT 540 GGAGATGCCA TTCCTGATGT ATGGGAAGAA AATGGGTATA CCATCAAAGGAAGAGTAGCT 600 GTTAAATGGG ACGAAGGATT AGCTGATAAG GGATATAAAA AGTTTGTTTCCAATCCTTTT 660 AGACAGCACA CTGCTGGTGA CCCCTATAGT GACTATGAAA AGGCATCAAAAGATTTGGAT 720 TTATCTAATG CAAAAGAAAC ATTTAATCCA TTGGTGGCTG CTTTTCCAAGTGTCAATGTT 780 AGCTTGGAAA ATGTCACCAT ATCAAAAGAT GAAAATAAAA CTGCTGAAATTGCGTCTACT 840 TCATCGAATA ATTGGTCCTA TACAAATACA GAGGGGGCAT CTATTGAAGCTGGAATTGGA 900 CCAGAAGGTT TGTTGTCTTT TGGAGTAAGT GCCAATTATC AACATTCTGAAACAGTGGCC 960 AAAGAGTGGG GTACAACTAA GGGAGACGCA ACACAATATA ATACAGCTTCAGCAGGATAT 1020 CTAAATGCCA ATGTACGATA TA 1042 347 amino acids amino acidsingle linear peptide 66D3 25 Leu Ile Gly Tyr Tyr Phe Lys Gly Lys AspPhe Asn Asn Leu Thr Ile 1 5 10 15 Phe Ala Pro Thr Arg Glu Asn Thr LeuIle Tyr Asp Leu Glu Thr Ala 20 25 30 Asn Ser Leu Leu Asp Lys Gln Gln GlnThr Tyr Gln Ser Ile Arg Trp 35 40 45 Ile Gly Leu Ile Lys Ser Lys Lys AlaGly Asp Phe Thr Phe Gln Leu 50 55 60 Ser Asp Asp Glu His Ala Ile Ile GluIle Asp Gly Lys Val Ile Ser 65 70 75 80 Gln Lys Gly Gln Lys Lys Gln ValVal His Leu Glu Lys Asp Lys Leu 85 90 95 Val Pro Ile Lys Ile Glu Tyr GlnSer Asp Lys Ala Leu Asn Pro Asp 100 105 110 Ser Gln Met Phe Lys Glu LeuLys Leu Phe Lys Ile Asn Ser Gln Lys 115 120 125 Gln Ser Gln Gln Val GlnGln Asp Glu Leu Arg Asn Pro Glu Phe Gly 130 135 140 Lys Glu Lys Thr GlnThr Tyr Leu Lys Lys Ala Ser Lys Ser Ser Leu 
145 150 155 160 Phe Ser AsnLys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Glu Asp 165 170 175 Thr AspThr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu Asn Gly 180 185 190 TyrThr Ile Lys Gly Arg Val Ala Val Lys Trp Asp Glu Gly Leu Ala 195 200 205Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe Arg Gln His Thr 210 215220 Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala Ser Lys Asp Leu Asp 225230 235 240 Leu Ser Asn Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala PhePro 245 250 255 Ser Val Asn Val Ser Leu Glu Asn Val Thr Ile Ser Lys AspGlu Asn 260 265 270 Lys Thr Ala Glu Ile Ala Ser Thr Ser Ser Asn Asn TrpSer Tyr Thr 275 280 285 Asn Thr Glu Gly Ala Ser Ile Glu Ala Gly Ile GlyPro Glu Gly Leu 290 295 300 Leu Ser Phe Gly Val Ser Ala Asn Tyr Gln HisSer Glu Thr Val Ala 305 310 315 320 Lys Glu Trp Gly Thr Thr Lys Gly AspAla Thr Gln Tyr Asn Thr Ala 325 330 335 Ser Ala Gly Tyr Leu Asn Ala AsnVal Arg Tyr 340 345 1278 base pairs nucleic acid single linear DNA(genomic) 68F 26 TGGATTACTT GGGTACTATT TTAAAGGGAA AGATTTTAAT GATCTTACTGTATTTGCACC 60 AACGCGTGGG AATACTCTTG TATATGATCA ACAAACAGCA AATACATTACTAAATCAAAA 120 ACAACAAGAC TTTCAGTCTA TTCGTTGGGT TGGTTTAATT CAAAGTAAAGAAGCAGGCGA 180 TTTTACATTT AACTTATCAG ATGATGAACA TACGATGATA GAAATCGATGGGAAAGTTAT 240 TTCTAATAAA GGGAAAGAAA AACAAGTTGT CCATTTAGAA AAAGGACAGTTCGTTTCTAT 300 CAAAATAGAA TATCAAGCTG ATGAACCATT TAATGCGGAT AGTCAAACCTTTAAAAATTT 360 GAAACTCTTT AAAGTAGATA CTAAGCAACA GTCCCAGCAA ATTCAACTAGATGAATTAAG 420 AAACCCTGAA TTTAATAAAA AAGAAACACA AGAATTTCTA ACAAAAGCAACAAAAACAAA 480 CCTTATTACT CAAAAAGTGA AGAGTACTAG GGATGAAGAC ACGGATACAGATGGAGATTC 540 TATTCCAGAC ATTTGGGAAG AAAATGGGTA TACCATCCAA AATAAGATTGCCGTCAAATG 600 GGATGATTCA TTAGCAAGTA AAGGATATAC GAAATTTGTT TCAAACCCACTAGATACTCA 660 CACGGTTGGA GATCCTTATA CAGATTATGA AAAAGCAGCA AGGGATTTAGATTTGTCAAA 720 TGCAAAAGAA ACATTTAACC CATTAGTTGC GGCTTTTCCA AGTGTGAATGTGAGTATGGA 780 AAAAGTGATA TTGTCTCCAG ATGAGAACTT ATCAAATAGT ATCGAGTCTCATTCATCTAC 840 GAATTGGTCG TATACGAATA CAGAAGGGGC TTCTATTGAA GCTGGTGGGGGAGCATTAGG 900 
CCTATCTTTT GGTGTAAGTG CAAACTATCA ACATTCTGAA ACAGTTGGGTATGAATGGGG 960 AACATCTACG GGAAATACTT CGCAATTTAA TACAGCTTCA GCGGGGTATTTAAATGCGAA 1020 TGTTCGCTAC AATAACGTGG GAACGGGTGC AATCTATGAT GTAAAGCCAACAACGAGTTT 1080 TGTATTAAAT AAAGATACCA TCGCAACGAT AACAGCAAAA TCGAATACGACTGCATTAAG 1140 TATCTCACCA GGACAAAGTT ATCCGAAACA AGGTCAAAAT GGAATCGCGATCACATCGAT 1200 GGATGATTTT AACTCACATC CGATTACATT GAATAAGCAA CAGGTAGGTCAACTGTTAAA 1260 TAATACCCAA TTAATCCA 1278 425 amino acids amino acidsingle linear peptide 68F 27 Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys AspPhe Asn Asp Leu Th 1 5 10 15 Val Phe Ala Pro Thr Arg Gly Asn Thr Leu ValTyr Asp Gln Gln Th 20 25 30 Ala Asn Thr Leu Leu Asn Gln Lys Gln Gln AspPhe Gln Ser Ile Ar 35 40 45 Trp Val Gly Leu Ile Gln Ser Lys Glu Ala GlyAsp Phe Thr Phe As 50 55 60 Leu Ser Asp Asp Glu His Thr Met Ile Glu IleAsp Gly Lys Val Il 65 70 75 80 Ser Asn Lys Gly Lys Glu Lys Gln Val ValHis Leu Glu Lys Gly Gl 85 90 95 Phe Val Ser Ile Lys Ile Glu Tyr Gln AlaAsp Glu Pro Phe Asn Al 100 105 110 Asp Ser Gln Thr Phe Lys Asn Leu LysLeu Phe Lys Val Asp Thr Ly 115 120 125 Gln Gln Ser Gln Gln Ile Gln LeuAsp Glu Leu Arg Asn Pro Glu Ph 130 135 140 Asn Lys Lys Glu Thr Gln GluPhe Leu Thr Lys Ala Thr Lys Thr As 145 150 155 160 Leu Ile Thr Gln LysVal Lys Ser Thr Arg Asp Glu Asp Thr Asp Th 165 170 175 Asp Gly Asp SerIle Pro Asp Ile Trp Glu Glu Asn Gly Tyr Thr Il 180 185 190 Gln Asn LysIle Ala Val Lys Trp Asp Asp Ser Leu Ala Ser Lys Gl 195 200 205 Tyr ThrLys Phe Val Ser Asn Pro Leu Asp Thr His Thr Val Gly As 210 215 220 ProTyr Thr Asp Tyr Glu Lys Ala Ala Arg Asp Leu Asp Leu Ser As 225 230 235240 Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala Phe Pro Ser Val As 245250 255 Val Ser Met Glu Lys Val Ile Leu Ser Pro Asp Glu Asn Leu Ser As260 265 270 Ser Ile Glu Ser His Ser Ser Thr Asn Trp Ser Tyr Thr Asn ThrGl 275 280 285 Gly Ala Ser Ile Glu Ala Gly Gly Gly Ala Leu Gly Leu SerPhe Gl 290 295 300 Val Ser Ala Asn Tyr Gln His Ser Glu Thr Val Gly TyrGlu Trp Gl 305 310 315 320 Thr Ser Thr Gly Asn Thr 
Ser Gln Phe Asn ThrAla Ser Ala Gly Ty 325 330 335 Leu Asn Ala Asn Val Arg Tyr Asn Asn ValGly Thr Gly Ala Ile Ty 340 345 350 Asp Val Lys Pro Thr Thr Ser Phe ValLeu Asn Lys Asp Thr Ile Al 355 360 365 Thr Ile Thr Ala Lys Ser Asn ThrThr Ala Leu Ser Ile Ser Pro Gl 370 375 380 Gln Ser Tyr Pro Lys Gln GlyGln Asn Gly Ile Ala Ile Thr Ser Me 385 390 395 400 Asp Asp Phe Asn SerHis Pro Ile Thr Leu Asn Lys Gln Gln Val Gl 405 410 415 Gln Leu Leu AsnAsn Thr Gln Leu Ile 420 425 983 base pairs nucleic acid single linearDNA (genomic) 69AA2 28 TGGATTACTT GGGTACTATT TTACTGATGA TCAGTTTACTAACACAGCAT TTATTCAAGT 60 AGGAGAAAAA AGTAAATTAC TAGATTCAAA AATAGTAAAACAAGATATGT CCAATTTGAA 120 ATCCATTCGA TGGGAAGGAA ATGTGAAACC TCCTGAAACAGGAGAATATC TACTTTCCAC 180 GTCCTCTAAT GAAAATGTTA CAGTAAAAGT AGATGGAGAAACTGTTATTA ACAAAGCTAA 240 CATGGAAAAA GCAATGAAAC TCGAAAAAGA TAAACCACACTCTATTGAAA TTGAATATCA 300 TGTTCCTGAG AACGGGAAGG AACTACAATT ATTTTGGCAAATAAATGACC AGAAAGCTGT 360 TAAAATCCCA GAAAAAAACA TACTATCACC AAATCTTTCTGAACAGATAC AACCGCAACA 420 GCGTTCAACT CAATCTCAAC AAAATCAAAA TGATAGGGATGGGGATAAAA TCCCTGATAG 480 TTTAGAAGAA AATGGCTATA CATTTAAAGA CGGTGCGATTGTTGCCTGGA ACGATTCCTA 540 TGCAGCACTA GGCTATAAAA AATACATATC CAATTCTAATAAGGCTAAAA CAGCTGCTGA 600 CCCCTATACG GACTTTGAAA AAGTAACAGG ACACATGCCGGAGGCAACTA AAGATGAAGT 660 AAAAGATCCA CTAGTAGCCG CTTATCCCTC GGTAGGTGTTGCTATGGAAA AATTTCATTT 720 TTCTAGAAAT GAAACGGTCA CTGAAGGAGA CTCAGGTACTGTTTCAAAAA CCGTAACCAA 780 TACAAGCACA ACAACAAATA GCATCGATGT TGGGGGATCCATTGGATGGG GAGAAAAAGG 840 ATTTTCTTTT TCATTCTCTC CCAAATATAC GCATTCTTGGAGTAATAGTA CCGCTGTTGC 900 TGATACTGAA AGTAGCACAT GGTCTTCACA ATTAGCGTATAATCCTTCAG AACGTGCTNT 960 CTTAAATGCC AATAKACGAT NTA 983 327 amino acidsamino acid single linear peptide 69AA2 29 Gly Leu Leu Gly Tyr Tyr PheThr Asp Asp Gln Phe Thr Asn Thr Al 1 5 10 15 Phe Ile Gln Val Gly Glu LysSer Lys Leu Leu Asp Ser Lys Ile Va 20 25 30 Lys Gln Asp Met Ser Asn LeuLys Ser Ile Arg Trp Glu Gly Asn Va 35 40 45 Lys Pro Pro Glu Thr Gly GluTyr Leu Leu Ser Thr Ser Ser Asn Gl 50 
55 60 Asn Val Thr Val Lys Val AspGly Glu Thr Val Ile Asn Lys Ala As 65 70 75 80 Met Glu Lys Ala Met LysLeu Glu Lys Asp Lys Pro His Ser Ile Gl 85 90 95 Ile Glu Tyr His Val ProGlu Asn Gly Lys Glu Leu Gln Leu Phe Tr 100 105 110 Gln Ile Asn Asp GlnLys Ala Val Lys Ile Pro Glu Lys Asn Ile Le 115 120 125 Ser Pro Asn LeuSer Glu Gln Ile Gln Pro Gln Gln Arg Ser Thr Gl 130 135 140 Ser Gln GlnAsn Gln Asn Asp Arg Asp Gly Asp Lys Ile Pro Asp Se 145 150 155 160 LeuGlu Glu Asn Gly Tyr Thr Phe Lys Asp Gly Ala Ile Val Ala Tr 165 170 175Asn Asp Ser Tyr Ala Ala Leu Gly Tyr Lys Lys Tyr Ile Ser Asn Se 180 185190 Asn Lys Ala Lys Thr Ala Ala Asp Pro Tyr Thr Asp Phe Glu Lys Va 195200 205 Thr Gly His Met Pro Glu Ala Thr Lys Asp Glu Val Lys Asp Pro Le210 215 220 Val Ala Ala Tyr Pro Ser Val Gly Val Ala Met Glu Lys Phe HisPh 225 230 235 240 Ser Arg Asn Glu Thr Val Thr Glu Gly Asp Ser Gly ThrVal Ser Ly 245 250 255 Thr Val Thr Asn Thr Ser Thr Thr Thr Asn Ser IleAsp Val Gly Gl 260 265 270 Ser Ile Gly Trp Gly Glu Lys Gly Phe Ser PheSer Phe Ser Pro Ly 275 280 285 Tyr Thr His Ser Trp Ser Asn Ser Thr AlaVal Ala Asp Thr Glu Se 290 295 300 Ser Thr Trp Ser Ser Gln Leu Ala TyrAsn Pro Ser Glu Arg Ala Xa 305 310 315 320 Leu Asn Ala Asn Xaa Arg Xaa325 1075 base pairs nucleic acid single linear DNA (genomic) 168G1 30TGGGTTAATT GGATATTATT TCCAGGATCA AAAATTTCAA CAACTCGCTT TAATGGTACA 60TAGGCAAGCT TCTGATTTAA AAATACTGAA AGATGACGTG AAACATTTAC TATCCGAAGA 120TCAACAACAC ATTCAATCAG TAAGGTGGAT AGGCTATATT AAGCCACCTA AAACAGGAGA 180CTACGTATTG TCAACCTCAT CCGACCAACA GGTCATGATT GAACTAGATG GTAAAGTCAT 240TCTCAATCAG GCTTCTATGA CAGAACCTGT TCAACTTGAA AAAGATAAAC CGTATAAAAT 300TAAAATTGAA TATGTTCCGG AACAAACAGA AACACAAGAT ACGCTTCTTG ATTTTAAACT 360GAACTGGTCT TTTTCAGGCG GAAAAACAGA AACGATTCCA GAAAATGCAT TTCTATTACC 420AGACCTTTCT CGTAAACAAG ATCAAGAAAA GCTTATTCCT GAGGCAAGTT TATTTCAGAA 480ACCTGGAGAC GAGAAAAAAA TATCTCGAAG TAAACGGTCC TTTAACTACA GATTCTCTAT 540ATGATACAAG ATGATGATGG GATTTCGGAT GCGTGGGAAA CAGAAGGATA CACGATACAA 600AGACAACTGG 
CAGTGAAATG GGACGATTCT ATGAAGGATC GAGGGTATAC CAAATATGTA 660TCTAATCCCT ATAATTCCCA TACAGTAGGG GATCCATACA CAGATTGGGA AAAAGCGGCT 720GGACGTATTG ATAAGGCGAT CAAAGGAGAA GCTAGGAATC CTTTAGTCGC GGCCTATCCA 780ACCGTTGGTG TACATATGGA AAAACTGATT GTCTCCGAGA AACAAAACAT ATCAACTGGA 840CTCGGAAAAA CAATATCTGC GTCAATGTCT GCAAGTAATA CCGCAGCGAT TACAGCGGGC 900ATTGATACGA CGGCTGGTGC TTCTTTACTT GGACCGTCTG GAAGCGTCAC GGCTCATTTT 960TCTGATACAG GATCCAGTAC ATCCACTGTT GAAAATAGCT CAAGTAATAA TTGGAGTCAA 1020GATCTTGGAA TCGATACGGG ACAATCTGCA TATTTAAATG CCAATGTACG ATATA 1075 2645base pairs nucleic acid single linear DNA (genomic) PS177C8 31ATGAAGAAGA AGTTAGCAAG TGTTGTAACG TGTACGTTAT TAGCTCCTAT GTTTTTGAAT 60GGAAATGTGA ATGCTGTTTA CGCAGACAGC AAAACAAATC AAATTTCTAC AACACAGAAA 120AATCAACAGA AAGAGATGGA CCGAAAAGGA TTACTTGGGT ATTATTTCAA AGGAAAAGAT 180TTTAGTAATC TTACTATGTT TGCACCGACA CGTGATAGTA CTCTTATTTA TGATCAACAA 240ACAGCAAATA AACTATTAGA TAAAAAACAA CAAGAATATC AGTCTATTCG TTGGATTGGT 300TTGATTCAGA GTAAAGAAAC GGGAGATTTC ACATTTAACT TATCTGAGGA TGAACAGGCA 360ATTATAGAAA TCAATGGGAA AATTATTTCT AATAAAGGGA AAGAAAAGCA AGTTGTCCAT 420TTAGAAAAAG GAAAATTAGT TCCAATCAAA ATAGAGTATC AATCAGATAC AAAATTTAAT 480ATTGACAGTA AAACATTTAA AGAACTTAAA TTATTTAAAA TAGATAGTCA AAACCAACCC 540CAGCAAGTCC AGCAAGATGA ACTGAGAAAT CCTGAATTTA ACAAGAAAGA ATCACAGGAA 600TTCTTAGCGA AACCATCGAA AATAAATCTT TTCACTCAAA AAATGAAAAG GGAAATTGAT 660GAAGACACGG ATACGGATGG GGACTCTATT CCTGACCTTT GGGAAGAAAA TGGGTATACG 720ATTCAAAATA GAATCGCTGT AAAGTGGGAC GATTCTYTAG CAAGTAAAGG GTATACGAAA 780TTTGTTTCAA ATCCGCTAGA AAGTCACACA GTTGGTGATC CTTATACAGA TTATGAAAAG 840GCAGCAAGAG ACCTAGATTT GTCAAATGCA AAGGAAACGT TTAACCCATT GGTAGCTGCT 900TTTCCAAGTG TGAATGTTAG TATGGAAAAG GTGATATTAT CACCAAATGA AAATTTATCC 960AATAGTGTAG AGTCTCATTC ATCCACGAAT TGGTCTTATA CAAATACAGA AGGTGCTTCT 1020GTTGAAGCGG GGATTGGACC AAAAGGTATT TCGTTCGGAG TTAGCGTAAA CTATCAACAC 1080TCTGAAACAG TTGCACAAGA ATGGGGAACA TCTACAGGAA ATACTTCGCA ATTCAATACG 1140GCTTCAGCGG GATATTTAAA TGCAAATGTT CGATATAACA ATGTAGGAAC TGGTGCCATC 1200TACGATGTAA 
AACCTACAAC AAGTTTTGTA TTAAATAACG ATACTATCGC AACTATTACG 1260GCGAAATCTA ATTCTACAGC CTTAAATATA TCTCCTGGAG AAAGTTACCC GAAAAAAGGA 1320CAAAATGGAA TCGCAATAAC ATCAATGGAT GATTTTAATT CCCATCCGAT TACATTAAAT 1380AAAAAACAAG TAGATAATCT GCTAAATAAT AAACCTATGA TGTTGGAAAC AAACCAAACA 1440GATGGTGTTT ATAAGATAAA AGATACACAT GGAAATATAG TAACTGGCGG AGAATGGAAT 1500GGTGTCATAC AACAAATCAA GGCTAAAACA GCGTCTATTA TTGTGGATGA TGGGGAACGT 1560GTAGCAGAAA AACGTGTAGC GGCAAAAGAT TATGAAAATC CAGAAGATAA AACACCGTCT 1620TTAACTTTAA AAGATGCCCT GAAGCTTTCA TATCCAGATG AAATAAAAGA AATAGAGGGA 1680TTATTATATT ATAAAAACAA ACCGATATAC GAATCGAGCG TTATGACTTA CTTAGATGAA 1740AATACAGCAA AAGAAGTGAC CAAACAATTA AATGATACCA CTGGGAAATT TAAAGATGTA 1800AGTCATTTAT ATGATGTAAA ACTGACTCCA AAAATGAATG TTACAATCAA ATTGTCTATA 1860CTTTATGATA ATGCTGAGTC TAATGATAAC TCAATTGGTA AATGGACAAA CACAAATATT 1920GTTTCAGGTG GAAATAACGG AAAAAAACAA TATTCTTCTA ATAATCCGGA TGCTAATTTG 1980ACATTAAATA CAGATGCTCA AGAAAAATTA AATAAAAATC GTACTATTAT ATAAGTTTAT 2040ATATGAAGTC AGAAAAAAAC ACACAATGTG AGATTACTAT AGATGGGGAG ATTTATCCGA 2100TCACTACAAA AACAGTGAAT GTGAATAAAG ACAATTACAA AAGATTAGAT ATTATAGCTC 2160ATAATATAAA AAGTAATCCA ATTTCTTCAA TTCATATTAA AACGAATGAT GAAATAACTT 2220TATTTTGGGA TGATATTTCT ATAACAGATG TAGCATCAAT AAAACCGGAA AATTTAACAG 2280ATTCAGAAAT TAAACAGATT TATAGTAGGT ATGGTATTAA GTTAGAAGAT GGAATCCTTA 2340TTGATAAAAA AGGTGGGATT CATTATGGTG AATTTATTAA TGAAGCTAGT TTTAATATTG 2400AACCATTGCA AAATTATGTG ACAAAATATA AAGTTACTTA TAGTAGTGAG TTAGGACAAA 2460ACGTGAGTGA CACACTTGAA AGTGATAAAA TTTACAAGGA TGGGACAATT AAATTTGATT 2520TTACAAAATA TAGTRAAAAT GAACAAGGAT TATTTTATGA CAGTGGATTA AATTGGGACT 2580TTAAAATTAA TGCTATTACT TATGATGGTA AAGAGATGAA TGTTTTTCAT AGATATAATA 2640AATAG 2645 881 amino acids amino acid single linear peptide PS177C8 32Met Lys Lys Lys Leu Ala Ser Val Val Thr Cys Thr Leu Leu Ala Pr 1 5 10 15Met Phe Leu Asn Gly Asn Val Asn Ala Val Tyr Ala Asp Ser Lys Th 20 25 30Asn Gln Ile Ser Thr Thr Gln Lys Asn Gln Gln Lys Glu Met Asp Ar 35 40 45Lys Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys 
Asp Phe Ser Asn Le 50 55 60Thr Met Phe Ala Pro Thr Arg Asp Ser Thr Leu Ile Tyr Asp Gln Gl 65 70 7580 Thr Ala Asn Lys Leu Leu Asp Lys Lys Gln Gln Glu Tyr Gln Ser Il 85 9095 Arg Trp Ile Gly Leu Ile Gln Ser Lys Glu Thr Gly Asp Phe Thr Ph 100105 110 Asn Leu Ser Glu Asp Glu Gln Ala Ile Ile Glu Ile Asn Gly Lys Il115 120 125 Ile Ser Asn Lys Gly Lys Glu Lys Gln Val Val His Leu Glu LysGl 130 135 140 Lys Leu Val Pro Ile Lys Ile Glu Tyr Gln Ser Asp Thr LysPhe As 145 150 155 160 Ile Asp Ser Lys Thr Phe Lys Glu Leu Lys Leu PheLys Ile Asp Se 165 170 175 Gln Asn Gln Pro Gln Gln Val Gln Gln Asp GluLeu Arg Asn Pro Gl 180 185 190 Phe Asn Lys Lys Glu Ser Gln Glu Phe LeuAla Lys Pro Ser Lys Il 195 200 205 Asn Leu Phe Thr Gln Lys Met Lys ArgGlu Ile Asp Glu Asp Thr As 210 215 220 Thr Asp Gly Asp Ser Ile Pro AspLeu Trp Glu Glu Asn Gly Tyr Th 225 230 235 240 Ile Gln Asn Arg Ile AlaVal Lys Trp Asp Asp Ser Leu Ala Ser Ly 245 250 255 Gly Tyr Thr Lys PheVal Ser Asn Pro Leu Glu Ser His Thr Val Gl 260 265 270 Asp Pro Tyr ThrAsp Tyr Glu Lys Ala Ala Arg Asp Leu Asp Leu Se 275 280 285 Asn Ala LysGlu Thr Phe Asn Pro Leu Val Ala Ala Phe Pro Ser Va 290 295 300 Asn ValSer Met Glu Lys Val Ile Leu Ser Pro Asn Glu Asn Leu Se 305 310 315 320Asn Ser Val Glu Ser His Ser Ser Thr Asn Trp Ser Tyr Thr Asn Th 325 330335 Glu Gly Ala Ser Val Glu Ala Gly Ile Gly Pro Lys Gly Ile Ser Ph 340345 350 Gly Val Ser Val Asn Tyr Gln His Ser Glu Thr Val Ala Gln Glu Tr355 360 365 Gly Thr Ser Thr Gly Asn Thr Ser Gln Phe Asn Thr Ala Ser AlaGl 370 375 380 Tyr Leu Asn Ala Asn Val Arg Tyr Asn Asn Val Gly Thr GlyAla Il 385 390 395 400 Tyr Asp Val Lys Pro Thr Thr Ser Phe Val Leu AsnAsn Asp Thr Il 405 410 415 Ala Thr Ile Thr Ala Lys Ser Asn Ser Thr AlaLeu Asn Ile Ser Pr 420 425 430 Gly Glu Ser Tyr Pro Lys Lys Gly Gln AsnGly Ile Ala Ile Thr Se 435 440 445 Met Asp Asp Phe Asn Ser His Pro IleThr Leu Asn Lys Lys Gln Va 450 455 460 Asp Asn Leu Leu Asn Asn Lys ProMet Met Leu Glu Thr Asn Gln Th 465 470 475 480 Asp Gly Val Tyr Lys IleLys Asp 
Thr His Gly Asn Ile Val Thr Gl 485 490 495 Gly Glu Trp Asn GlyVal Ile Gln Gln Ile Lys Ala Lys Thr Ala Se 500 505 510 Ile Ile Val AspAsp Gly Glu Arg Val Ala Glu Lys Arg Val Ala Al 515 520 525 Lys Asp TyrGlu Asn Pro Glu Asp Lys Thr Pro Ser Leu Thr Leu Ly 530 535 540 Asp AlaLeu Lys Leu Ser Tyr Pro Asp Glu Ile Lys Glu Ile Glu Gl 545 550 555 560Leu Leu Tyr Tyr Lys Asn Lys Pro Ile Tyr Glu Ser Ser Val Met Th 565 570575 Tyr Leu Asp Glu Asn Thr Ala Lys Glu Val Thr Lys Gln Leu Asn As 580585 590 Thr Thr Gly Lys Phe Lys Asp Val Ser His Leu Tyr Asp Val Lys Le595 600 605 Thr Pro Lys Met Asn Val Thr Ile Lys Leu Ser Ile Leu Tyr AspAs 610 615 620 Ala Glu Ser Asn Asp Asn Ser Ile Gly Lys Trp Thr Asn ThrAsn Il 625 630 635 640 Val Ser Gly Gly Asn Asn Gly Lys Lys Gln Tyr SerSer Asn Asn Pr 645 650 655 Asp Ala Asn Leu Thr Leu Asn Thr Asp Ala GlnGlu Lys Leu Asn Ly 660 665 670 Asn Arg Asp Tyr Tyr Ile Ser Leu Tyr MetLys Ser Glu Lys Asn Th 675 680 685 Gln Cys Glu Ile Thr Ile Asp Gly GluIle Tyr Pro Ile Thr Thr Ly 690 695 700 Thr Val Asn Val Asn Lys Asp AsnTyr Lys Arg Leu Asp Ile Ile Al 705 710 715 720 His Asn Ile Lys Ser AsnPro Ile Ser Ser Ile His Ile Lys Thr As 725 730 735 Asp Glu Ile Thr LeuPhe Trp Asp Asp Ile Ser Ile Thr Asp Val Al 740 745 750 Ser Ile Lys ProGlu Asn Leu Thr Asp Ser Glu Ile Lys Gln Ile Ty 755 760 765 Ser Arg TyrGly Ile Lys Leu Glu Asp Gly Ile Leu Ile Asp Lys Ly 770 775 780 Gly GlyIle His Tyr Gly Glu Phe Ile Asn Glu Ala Ser Phe Asn Il 785 790 795 800Glu Pro Leu Gln Asn Tyr Val Thr Lys Tyr Lys Val Thr Tyr Ser Se 805 810815 Glu Leu Gly Gln Asn Val Ser Asp Thr Leu Glu Ser Asp Lys Ile Ty 820825 830 Lys Asp Gly Thr Ile Lys Phe Asp Phe Thr Lys Tyr Ser Xaa Asn Gl835 840 845 Gln Gly Leu Phe Tyr Asp Ser Gly Leu Asn Trp Asp Phe Lys IleAs 850 855 860 Ala Ile Thr Tyr Asp Gly Lys Glu Met Asn Val Phe His ArgTyr As 865 870 875 880 Lys 1022 base pairs nucleic acid single linearDNA (genomic) 177I8 33 TGGATTAATT GGGTATTATT TCAAAGGAAA AGATTTTAATAATCTTACTA TGTTTGCACC 60 GACACGTGAT AATACCCTTA 
TGTATGACCA ACAAACAGCGAATGCATTAT TAGATAAAAA 120 ACAACAAGAA TATCAGTCCA TTCGTTGGAT TGGTTTGATTCAGAGTAAAG AAACGGGCGA 180 TTTCACATTT AACTTATCAA AGGATGAACA GGCAATTATAGAAATCGATG GGAAAATCAT 240 TTCTAATAAA GGGAAAGAAA AGCAAGTTGT CCATTTAGAAAAAGAAAAAT TAGTTCCAAT 300 CAAAATAGAG TATCAATCAG ATACGAAATT TAATATTGATAGTAAAACAT TTAAAGAACT 360 TAAATTATTT AAAATAGATA GTCAAAACCA ATCTCAACAAGTTCAACTGA GAAACCCTGA 420 ATTTAACAAA AAAGAATCAC AGGAATTTTT AGCAAAAGCATCAAAAACAA ACCTTTTTAA 480 GCAAAAAATG AAAAGAGATA TTGATGAAGA TACGGATACAGATGGAGACT CCATTCCTGA 540 TCTTTGGGAA GAAAATGGGT ACACGATTCA AAATAAAGTTGCTGTCAAAT GGGATGATTC 600 GCTAGCAAGT AAGGGATATA CAAAATTTGT TTCGAATCCATTAGACAGCC ACACAGTTGG 660 CGATCCCTAT ACTGATTATG AAAAGGCCGC AAGGGATTTAGATTTATCAA ATGCAAAGGA 720 AACGTTCAAC CCATTGGTAG CTGCTTTYCC AAGTGTGAATGTTAGTATGG AAAAGGTGAT 780 ATTATCACCA AATGAAAATT TATCCAATAG TGTAGAGTCTCATTCATCCA CGAATTGGTC 840 TTATACGAAT ACAGAAGGAG CTTCCATTGA AGCTGGTGGCGGTCCATTAG GCCTTTCTTT 900 TGGAGTGAGT GTTAATTATC AACACTCTGA AACAGTTGCACAAGAATGGG GAACATCTAC 960 AGGAAATACT TCACAATTCA ATACGGCTTC AGCGGGATATTTAAATGCCA ATATACGATA 1020 TA 1022 340 amino acids amino acid singlelinear peptide 177I8 34 Gly Leu Ile Gly Tyr Tyr Phe Lys Gly Lys Asp PheAsn Asn Leu Th 1 5 10 15 Met Phe Ala Pro Thr Arg Asp Asn Thr Leu Met TyrAsp Gln Gln Th 20 25 30 Ala Asn Ala Leu Leu Asp Lys Lys Gln Gln Glu TyrGln Ser Ile Ar 35 40 45 Trp Ile Gly Leu Ile Gln Ser Lys Glu Thr Gly AspPhe Thr Phe As 50 55 60 Leu Ser Lys Asp Glu Gln Ala Ile Ile Glu Ile AspGly Lys Ile Il 65 70 75 80 Ser Asn Lys Gly Lys Glu Lys Gln Val Val HisLeu Glu Lys Glu Ly 85 90 95 Leu Val Pro Ile Lys Ile Glu Tyr Gln Ser AspThr Lys Phe Asn Il 100 105 110 Asp Ser Lys Thr Phe Lys Glu Leu Lys LeuPhe Lys Ile Asp Ser Gl 115 120 125 Asn Gln Ser Gln Gln Val Gln Leu ArgAsn Pro Glu Phe Asn Lys Ly 130 135 140 Glu Ser Gln Glu Phe Leu Ala LysAla Ser Lys Thr Asn Leu Phe Ly 145 150 155 160 Gln Lys Met Lys Arg AspIle Asp Glu Asp Thr Asp Thr Asp Gly As 165 170 175 Ser Ile Pro Asp LeuTrp Glu Glu Asn Gly Tyr Thr 
Ile Gln Asn Ly 180 185 190 Val Ala Val LysTrp Asp Asp Ser Leu Ala Ser Lys Gly Tyr Thr Ly 195 200 205 Phe Val SerAsn Pro Leu Asp Ser His Thr Val Gly Asp Pro Tyr Th 210 215 220 Asp TyrGlu Lys Ala Ala Arg Asp Leu Asp Leu Ser Asn Ala Lys Gl 225 230 235 240Thr Phe Asn Pro Leu Val Ala Ala Xaa Pro Ser Val Asn Val Ser Me 245 250255 Glu Lys Val Ile Leu Ser Pro Asn Glu Asn Leu Ser Asn Ser Val Gl 260265 270 Ser His Ser Ser Thr Asn Trp Ser Tyr Thr Asn Thr Glu Gly Ala Se275 280 285 Ile Glu Ala Gly Gly Gly Pro Leu Gly Leu Ser Phe Gly Val SerVa 290 295 300 Asn Tyr Gln His Ser Glu Thr Val Ala Gln Glu Trp Gly ThrSer Th 305 310 315 320 Gly Asn Thr Ser Gln Phe Asn Thr Ala Ser Ala GlyTyr Leu Asn Al 325 330 335 Asn Ile Arg Tyr 340 1073 base pairs nucleicacid single linear DNA (genomic) 185AA2 35 TGGATTAATT GGGTATTATTTCCAGGAGCA AAACTTTGAG AAACCCGCTT TGATAGCAAA 60 TAGACAAGCT TCTGATTTGGAAATACCGAA AGATGACGTG AAAGAGTTAC TATCCAAAGA 120 ACAGCAACAC ATTCAATCTGTTAGATGGCT TGGCTATATT CAGCCACCTC AAACAGGAGA 180 CTATGTATTG TCAACCTCATCCGACCAACA GGTCGTGATT GAACTCGATG GAAAAACCAT 240 TGTCAATCAA ACTTCTATGACAGAACCGAT TCAACTAGAA AAAGATAAAC GCTATAAAAT 300 TAGAATTGAA TATGTCCCAGGAGATACACA AGGACAAGAG AACCTTCTGG ACTTTCAACT 360 GAAGTGGTCA ATTTCAGGAGCCGAGATAGA ACCAATTCCG GATCATGCTT TCCATTTACC 420 AGATTTTTCT CATAAACAAGATCAAGAGAA AATCATCCCT GAAACCAATT TATTTCAGAA 480 ACAAGGAGAT GAGAAAAAAGTATCACGCAG TAAGAGATCT TCAGATAAAG ATCCTGACCG 540 TGATACAGAT GATGATAGTATTTCTGATGA ATGGGAAACG AGTGGATATA CCATTCAAAG 600 ACAGGTGGCA GTGAAATGGGACGATTCTAT GAAGGAGCTA GGTTATACCA AGTATGTGTC 660 TAACCCTTAT AAGTCTCGTACAGTAGGAGA TCCATACACA GATTGGGAAA AAGCGGCTGG 720 CAGTATCGAT AATGCTGTCAAAGCAGAAGC CAGAAATCCT TTAGTCGCGG CCTATCCAAC 780 TGTTGGTGTA CATATGGAAAGATTAATTGT CTCCGAACAA CAAAATATAT CAACAGGGCT 840 TGGAAAAACC GTATCTGCGTCTACGTCCGC AAGCAATACC GCAGCGATTA CGGCAGGTAT 900 TGATGCAACA GCTGGTGCCTCTTTACTTGG GCCATCTGGA AGTGTCACGG CTCATTTTTC 960 TTACACGGGA TCTAGTACAGCCACCATTGA AGATAGCTCC AGCCGTAATT GGAGTCGAGA 1020 CCTTGGGATT GATACGGGACAAGCTGCATA 
TTTAAATGCC AATATACGAT ATA 1073 357 amino acids amino acidsingle linear peptide 185AA2 36 Gly Leu Ile Gly Tyr Tyr Phe Gln Glu GlnAsn Phe Glu Lys Pro Al 1 5 10 15 Leu Ile Ala Asn Arg Gln Ala Ser Asp LeuGlu Ile Pro Lys Asp As 20 25 30 Val Lys Glu Leu Leu Ser Lys Glu Gln GlnHis Ile Gln Ser Val Ar 35 40 45 Trp Leu Gly Tyr Ile Gln Pro Pro Gln ThrGly Asp Tyr Val Leu Se 50 55 60 Thr Ser Ser Asp Gln Gln Val Val Ile GluLeu Asp Gly Lys Thr Il 65 70 75 80 Val Asn Gln Thr Ser Met Thr Glu ProIle Gln Leu Glu Lys Asp Ly 85 90 95 Arg Tyr Lys Ile Arg Ile Glu Tyr ValPro Gly Asp Thr Gln Gly Gl 100 105 110 Glu Asn Leu Leu Asp Phe Gln LeuLys Trp Ser Ile Ser Gly Ala Gl 115 120 125 Ile Glu Pro Ile Pro Asp HisAla Phe His Leu Pro Asp Phe Ser Hi 130 135 140 Lys Gln Asp Gln Glu LysIle Ile Pro Glu Thr Asn Leu Phe Gln Ly 145 150 155 160 Gln Gly Asp GluLys Lys Val Ser Arg Ser Lys Arg Ser Ser Asp Ly 165 170 175 Asp Pro AspArg Asp Thr Asp Asp Asp Ser Ile Ser Asp Glu Trp Gl 180 185 190 Thr SerGly Tyr Thr Ile Gln Arg Gln Val Ala Val Lys Trp Asp As 195 200 205 SerMet Lys Glu Leu Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Ly 210 215 220Ser Arg Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gl 225 230235 240 Ser Ile Asp Asn Ala Val Lys Ala Glu Ala Arg Asn Pro Leu Val Al245 250 255 Ala Tyr Pro Thr Val Gly Val His Met Glu Arg Leu Ile Val SerGl 260 265 270 Gln Gln Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser AlaSer Th 275 280 285 Ser Ala Ser Asn Thr Ala Ala Ile Thr Ala Gly Ile AspAla Thr Al 290 295 300 Gly Ala Ser Leu Leu Gly Pro Ser Gly Ser Val ThrAla His Phe Se 305 310 315 320 Tyr Thr Gly Ser Ser Thr Ala Thr Ile GluAsp Ser Ser Ser Arg As 325 330 335 Trp Ser Arg Asp Leu Gly Ile Asp ThrGly Gln Ala Ala Tyr Leu As 340 345 350 Ala Asn Ile Arg Tyr 355 1073 basepairs nucleic acid single linear DNA (genomic) 196F3 37 TGGGTTACNTGGGTATTAYT TTCAGGATAC TAAATTTCAA CAACTTGCTT TAATGGCACA 60 TAGACAAGCCTCAGATTTAG AAATAAACAA AAATGAMGTC AAGGATTTAC TATCAAAGGA 120 TCAACAACACATTCAAGCAG TGAGATGGAT GGGCTATATT CAGCCACCTC 
AAACAGGAGA 180 TTATGTATTGTCAACTTCAT CCGACCAACA GGTCTTCACC GAACTCNATG GAAAAATAAT 240 TCTCAATCAATCTTCTATGA CCGAACCCAT TCGATTAGAA AAAGATAAAC AATATAMAAT 300 TAGAATTGAATATGTATCAK AAAGTAAAAC AGAAAAAGAG ACGCTCCTAG ACTTTCAACT 360 CAACTGGTCGATTTCAGGTG CTACGGTAGA ACCAATTCCA GATAATGCTT TTCAGTTACC 420 AGATCTTTCTCGGGAACAAG NTAAAGATAA AATCATCCCT GAAACAAGTT TATTGCAGGA 480 TCAAGGAGAAGGGAAACAAG TATCTCGAAG TAAAAGATCT CTAGCTGTGA ATCCTCTACA 540 CGATACAGATGATGATGGGA TTTACGATGA ATGGGAAACA AGCGGCTATA CGATTCAAAG 600 ACAATTGGCAGTAAGATGGA ACGATTCTAT GAAGGATCAA GGCTATACCA AATATGTGTC 660 TAATCCTTATAAGTCTCATA CTGTAGGAGA TCCATACACA GACTGGGAAA AAGCAGCTGG 720 ACGTATCGACCAAGCTGTGA AAATAGAAGC CAGAAACCCA TTAGTTGCAG CATATCCAAC 780 AGTTGGCGTACATATGGAAA GACTGATTGT CTCTGAAAAA CAAAATATAG CAACAGGACT 840 GGGAAAAACAGTATCTGCGT CTACATCTGC AAGTAATACA GCGGGGATTA CAGCGGGAAT 900 CGATGCAACGGTTGGTGCCT CTTTACTTGG ACCTTCGGGA AGTGTCACCG CCCATTTTTC 960 TTATACGGGTTCGAGTACAT CCACTGTTGA AAATAGCTCG AGTAATAATT GGAGTCAAGA 1020 TCTTGGTATTGATACCAGCC AATCTGCGTA CTTAAATGCC AATGTAAGAT ATA 1073 357 amino acidsamino acid single linear peptide 196F3 38 Gly Leu Xaa Gly Tyr Xaa PheGln Asp Thr Lys Phe Gln Gln Leu Al 1 5 10 15 Leu Met Ala His Arg Gln AlaSer Asp Leu Glu Ile Asn Lys Asn Xa 20 25 30 Val Lys Asp Leu Leu Ser LysAsp Gln Gln His Ile Gln Ala Val Ar 35 40 45 Trp Met Gly Tyr Ile Gln ProPro Gln Thr Gly Asp Tyr Val Leu Se 50 55 60 Thr Ser Ser Asp Gln Gln ValPhe Thr Glu Leu Xaa Gly Lys Ile Il 65 70 75 80 Leu Asn Gln Ser Ser MetThr Glu Pro Ile Arg Leu Glu Lys Asp Ly 85 90 95 Gln Tyr Xaa Ile Arg IleGlu Tyr Val Ser Xaa Ser Lys Thr Glu Ly 100 105 110 Glu Thr Leu Leu AspPhe Gln Leu Asn Trp Ser Ile Ser Gly Ala Th 115 120 125 Val Glu Pro IlePro Asp Asn Ala Phe Gln Leu Pro Asp Leu Ser Ar 130 135 140 Glu Gln XaaLys Asp Lys Ile Ile Pro Glu Thr Ser Leu Leu Gln As 145 150 155 160 GlnGly Glu Gly Lys Gln Val Ser Arg Ser Lys Arg Ser Leu Ala Va 165 170 175Asn Pro Leu His Asp Thr Asp Asp Asp Gly Ile Tyr Asp Glu Trp Gl 180 185190 Thr Ser Gly Tyr Thr 
Ile Gln Arg Gln Leu Ala Val Arg Trp Asn As 195200 205 Ser Met Lys Asp Gln Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Ly210 215 220 Ser His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala AlaGl 225 230 235 240 Arg Ile Asp Gln Ala Val Lys Ile Glu Ala Arg Asn ProLeu Val Al 245 250 255 Ala Tyr Pro Thr Val Gly Val His Met Glu Arg LeuIle Val Ser Gl 260 265 270 Lys Gln Asn Ile Ala Thr Gly Leu Gly Lys ThrVal Ser Ala Ser Th 275 280 285 Ser Ala Ser Asn Thr Ala Gly Ile Thr AlaGly Ile Asp Ala Thr Va 290 295 300 Gly Ala Ser Leu Leu Gly Pro Ser GlySer Val Thr Ala His Phe Se 305 310 315 320 Tyr Thr Gly Ser Ser Thr SerThr Val Glu Asn Ser Ser Ser Asn As 325 330 335 Trp Ser Gln Asp Leu GlyIle Asp Thr Ser Gln Ser Ala Tyr Leu As 340 345 350 Ala Asn Val Arg Tyr355 1073 base pairs nucleic acid single linear DNA (genomic) 196J4 39TGGGTTAATT GGGTATTATT TCCAGGATCA AAAGTTTCAA CAACTTGCTT TAATGGCACAB 60TAGACAAGCT TCTAATTTAA ACATACCAAA AAATGAAGTG AAACAGTTAT TATCCGAAGA 120TCAACAACAT ATTCAATCCG TTAGGTGGAT CGGATATATC AAATCACCTC AAACGGGAGA 180TTATATATTG TCAACTTCAG CCGATCGACA TGTCGTAATT GAACTTGACG GAAAAACCAT 240TCTTAATCAA TCTTCTATGA CAGCACCCAT TCAATTAGAA AAAGATAAAC TTTATAAAAT 300TAGAATTGAA TATGTCCCAG AAGATACAAA AGGACAGGAA AACCTCTTTG ACTTTCAACT 360GAATTGGTCA ATTTCAGGAG ATAAGGTAGA ACCAATTCCG GAGAATGCAT TTCTGTTGCC 420AGACTTTTCT CATAAACAAG ATCAAGAGAA AATCATCCCT GAAGCAAGTT TATTCCAGGA 480ACAAGAAGAT GCAAACAAAG TCTCTCGAAA TAAACGATCC ATAGCTACAG GTTCTCTGTA 540TGATACAGAT GATGATGCTA TTTATGATGA ATGGGAAACA GAAGGATACA CGATACAACG 600TCAAATAGCG GTGAAATGGG ACGATTCTAT GAAGGAGCGA GGTTATACCA AGTATGTGTC 660TAACCCCTAT AATTCGCATA CAGTAGGAGA TCCCTACACA GATTGGGAAA AAGCGGCTGG 720ACGCATTGAT CAGGCAATCA AAGTAGAAGC TAGGAATCCA TTAGTTGCAG CCTATCCAAC 780AGTTGGTGTA CATATGGAAA AACTGATTGT TTCTGAGAAA CAAAATATAT CAACTGGGGT 840TGGAAAAACA GTATCTGCGG CTATGTCCAC TGGTAATACC GCAGCGATTA CGGCAGGAAT 900TGATGCGACC GCCGGGGCAT CTTTACTTGG ACCTTCTGGA AGTGTGACGG CTCATTTTTC 960TTATACAGGG TCTAGTACAT CTACAATTGA AAATAGTTCA AGCAATAATT GGAGTAAAGA 
1020TCTGGGAATC GATACGGGGC AATCTGCTTA TTTAAATGCC AATGTACGAT ATA 1073 357amino acids amino acid single linear peptide 196J4 40 Gly Leu Ile GlyTyr Tyr Phe Gln Asp Gln Lys Phe Gln Gln Leu Al 1 5 10 15 Leu Met Ala HisArg Gln Ala Ser Asn Leu Asn Ile Pro Lys Asn Gl 20 25 30 Val Lys Gln LeuLeu Ser Glu Asp Gln Gln His Ile Gln Ser Val Ar 35 40 45 Trp Ile Gly TyrIle Lys Ser Pro Gln Thr Gly Asp Tyr Ile Leu Se 50 55 60 Thr Ser Ala AspArg His Val Val Ile Glu Leu Asp Gly Lys Thr Il 65 70 75 80 Leu Asn GlnSer Ser Met Thr Ala Pro Ile Gln Leu Glu Lys Asp Ly 85 90 95 Leu Tyr LysIle Arg Ile Glu Tyr Val Pro Glu Asp Thr Lys Gly Gl 100 105 110 Glu AsnLeu Phe Asp Phe Gln Leu Asn Trp Ser Ile Ser Gly Asp Ly 115 120 125 ValGlu Pro Ile Pro Glu Asn Ala Phe Leu Leu Pro Asp Phe Ser Hi 130 135 140Lys Gln Asp Gln Glu Lys Ile Ile Pro Glu Ala Ser Leu Phe Gln Gl 145 150155 160 Gln Glu Asp Ala Asn Lys Val Ser Arg Asn Lys Arg Ser Ile Ala Th165 170 175 Gly Ser Leu Tyr Asp Thr Asp Asp Asp Ala Ile Tyr Asp Glu TrpGl 180 185 190 Thr Glu Gly Tyr Thr Ile Gln Arg Gln Ile Ala Val Lys TrpAsp As 195 200 205 Ser Met Lys Glu Arg Gly Tyr Thr Lys Tyr Val Ser AsnPro Tyr As 210 215 220 Ser His Thr Val Gly Asp Pro Tyr Thr Asp Trp GluLys Ala Ala Gl 225 230 235 240 Arg Ile Asp Gln Ala Ile Lys Val Glu AlaArg Asn Pro Leu Val Al 245 250 255 Ala Tyr Pro Thr Val Gly Val His MetGlu Lys Leu Ile Val Ser Gl 260 265 270 Lys Gln Asn Ile Ser Thr Gly ValGly Lys Thr Val Ser Ala Ala Me 275 280 285 Ser Thr Gly Asn Thr Ala AlaIle Thr Ala Gly Ile Asp Ala Thr Al 290 295 300 Gly Ala Ser Leu Leu GlyPro Ser Gly Ser Val Thr Ala His Phe Se 305 310 315 320 Tyr Thr Gly SerSer Thr Ser Thr Ile Glu Asn Ser Ser Ser Asn As 325 330 335 Trp Ser LysAsp Leu Gly Ile Asp Thr Gly Gln Ser Ala Tyr Leu As 340 345 350 Ala AsnVal Arg Tyr 355 1046 base pairs nucleic acid single linear DNA (genomic)197T1 41 TGGATTAATT GGGTATTATT TTAAAGGAAA AGATTTTAAT AATCTTACTATATTTGCTCC 60 AACACGTGAG AATACTCTTA TTTATGATTT AGAAACAGCG AATTCTTTATTAGATAAGCA 120 ACAACAAACC TATCAATCTA 
TTCGTTGGAT CGGTTTAATA AAAAGCAAAAAAGCTGGAGA 180 TTTTACCTTT CAATTATCGG ATGATGAGCA TGCTATTATA GAAATCGATGGGAAAGTTAT 240 TTCGCAAAAA GGCCAAAAGA AACAAGTTGT TCATTTAGAA AAAGATAAATTAGTTCCCAT 300 CAAAATTGAA TATCAATCTG ATAAAGCGTT AAACCCAGAC AGTCAAATGTTTAAAGAATT 360 GAAATTATTT AAAATAAATA GTCAAAAACA ATCTCAGCAA GTGCAACAAGACGAATTGAG 420 AAATCCTGAA TTTGGTAAAG AAAAAACTCA AACATATTTA AAGAAAGCATCGAAAAGCAG 480 CTTGTTTAGC AATAAAAGTA AACGAGATAT AGATGAAGAT ATAGATGAGGATACAGATAC 540 AGATGGAGAT GCCATTCCTG ATGTATGGGA AGAAAATGGG TATACCATCAAAGGAAGAGT 600 AGCTGTTAAA TGGGACGAAG GATTAGCTGA TAAGGGATAT AAAAAGTTTGTTTCCAATCC 660 TTTTAGACAG CACACTGCTG GTGACCCCTA TAGTGACTAT GAAAAGGCATCAAAAGATTT 720 GGATTTATCT AATGCAAAAG AAACATTTAA TCCATTGGTG GCTGCTTTTCCAAGTGTCAA 780 TGTTAGCTTG GAAAATGTCA CCATATCAAA AGATGAAAAT AAAACTGCTGAAATTGCGTC 840 TACTTCATCG AATAATTGGT CCTATACAAA TACAGAGGGG GCATCTATTGAAGCTGGAAT 900 TGGACCAGAA GGTTTGTTGT CTTTTGGAGT AAGTGCCAAT TATCAACATTCTGAAACAGT 960 GGCCAAAGAG TGGGGTACAA CTAAGGGAGA CGCAACACAA TATAATACAGCTTCAGCAGG 1020 ATATCTAAAT GCCAATGTAC GATATA 1046 348 amino acids aminoacid single linear peptide 197T1 42 Gly Leu Ile Gly Tyr Tyr Phe Lys GlyLys Asp Phe Asn Asn Leu Th 1 5 10 15 Ile Phe Ala Pro Thr Arg Glu Asn ThrLeu Ile Tyr Asp Leu Glu Th 20 25 30 Ala Asn Ser Leu Leu Asp Lys Gln GlnGln Thr Tyr Gln Ser Ile Ar 35 40 45 Trp Ile Gly Leu Ile Lys Ser Lys LysAla Gly Asp Phe Thr Phe Gl 50 55 60 Leu Ser Asp Asp Glu His Ala Ile IleGlu Ile Asp Gly Lys Val Il 65 70 75 80 Ser Gln Lys Gly Gln Lys Lys GlnVal Val His Leu Glu Lys Asp Ly 85 90 95 Leu Val Pro Ile Lys Ile Glu TyrGln Ser Asp Lys Ala Leu Asn Pr 100 105 110 Asp Ser Gln Met Phe Lys GluLeu Lys Leu Phe Lys Ile Asn Ser Gl 115 120 125 Lys Gln Ser Gln Gln ValGln Gln Asp Glu Leu Arg Asn Pro Glu Ph 130 135 140 Gly Lys Glu Lys ThrGln Thr Tyr Leu Lys Lys Ala Ser Lys Ser Se 145 150 155 160 Leu Phe SerAsn Lys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Gl 165 170 175 Asp ThrAsp Thr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu As 180 185 190 GlyTyr Thr Ile 
Lys Gly Arg Val Ala Val Lys Trp Asp Glu Gly Le 195 200 205Ala Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe Arg Gln Hi 210 215220 Thr Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala Ser Lys Asp Le 225230 235 240 Asp Leu Ser Asn Ala Lys Glu Thr Phe Asn Pro Leu Val Ala AlaPh 245 250 255 Pro Ser Val Asn Val Ser Leu Glu Asn Val Thr Ile Ser LysAsp Gl 260 265 270 Asn Lys Thr Ala Glu Ile Ala Ser Thr Ser Ser Asn AsnTrp Ser Ty 275 280 285 Thr Asn Thr Glu Gly Ala Ser Ile Glu Ala Gly IleGly Pro Glu Gl 290 295 300 Leu Leu Ser Phe Gly Val Ser Ala Asn Tyr GlnHis Ser Glu Thr Va 305 310 315 320 Ala Lys Glu Trp Gly Thr Thr Lys GlyAsp Ala Thr Gln Tyr Asn Th 325 330 335 Ala Ser Ala Gly Tyr Leu Asn AlaAsn Val Arg Tyr 340 345 1002 base pairs nucleic acid single linear DNA(genomic) 197U2 43 TGGGTTAATT GGGTATTATT TTACGGATGA GCAGCATAAGGAAGTAGCTT TTAYTCAATT 60 AGGTGAAAAA AMTACATTAG CAGATTCAGC GAAAATGAAGAAAAACGACA AAAAGATTCT 120 TTCAGCGCAA TGGATTGGWA ATATACAGGT ACCTCAAACAGGGGAATATA CGTTTTCCAC 180 CTCTTCTGAT AAAGATACTA TTTTAAAACT CAATGGGGAAACGATTATTC AAAAATCTAA 240 TATGGAGAAA CCCATATATT TAGAAAAAGA TAAAGTATACGAAATTCAAA TCGAGCATAA 300 CAACCCGAAT AGTGAGAAAA CTTTACGATT ATCTTGGAAAATGGGGGGCA CCAATTCAGA 360 GCTCATCCCA GAAAAATACA TTCTGTCTCC CGATTTTTCTAAAATAGCAG ATCAAGAAAA 420 TGARAAAAAA GACGCATCGA GACATTTATT ATTTACTAAGGATGAATTGA AAGATTCTGA 480 TAAGGACCTT ATCCCAGATG AATTTGAAAA AAATGGGTATACATTCAATG GGATTCAAAT 540 TGTTCCTTGG GATGAATCTC TTCAAGAACA GGGCTTTAAAAAATATATTT CCAATCCATA 600 TCAATCGCGT ACAGCGCAGG ATCCATATAC AGATTTTGAAAAAGTAACCG GATATATGCC 660 TGCCGAAACA CAACTGGAAA CGCGTGACCC TTTAGTTGCGGCTTATCCGG CTGTAGGGGT 720 TACGATGGAA CAGTTTATTT TCTCTAAAAA TGATAATGTGCAGGAATCTA ATGGTGGAGG 780 AACTTCAAAA AGTATGACAG AAAGTTCTGA AACGACTTACTCTGTTGAGA TAGGAGGGAA 840 ATTTACATTG AATCCATTCG CACTGGCGGA AATTTCTCCTAAATATTCTC ACAGTTGGAA 900 AAATGGAGCA TCTACAACAG AGGGAGAAAG TACTTCCTGGAGCTCACAAA TTGGTATTAA 960 CACGGCTGAA CGCGCGTTTT TTAAATGCCA ATATTCGATA TA1002 333 amino acids amino acid single linear peptide 197U2 44 Gly 
LeuIle Gly Tyr Tyr Phe Thr Asp Glu Gln His Lys Glu Val Al 1 5 10 15 Phe XaaGln Leu Gly Glu Lys Xaa Thr Leu Ala Asp Ser Ala Lys Me 20 25 30 Lys LysAsn Asp Lys Lys Ile Leu Ser Ala Gln Trp Ile Xaa Asn Il 35 40 45 Gln ValPro Gln Thr Gly Glu Tyr Thr Phe Ser Thr Ser Ser Asp Ly 50 55 60 Asp ThrIle Leu Lys Leu Asn Gly Glu Thr Ile Ile Gln Lys Ser As 65 70 75 80 MetGlu Lys Pro Ile Tyr Leu Glu Lys Asp Lys Val Tyr Glu Ile Gl 85 90 95 IleGlu His Asn Asn Pro Asn Ser Glu Lys Thr Leu Arg Leu Ser Tr 100 105 110Lys Met Gly Gly Thr Asn Ser Glu Leu Ile Pro Glu Lys Tyr Ile Le 115 120125 Ser Pro Asp Phe Ser Lys Ile Ala Asp Gln Glu Asn Xaa Lys Lys As 130135 140 Ala Ser Arg His Leu Leu Phe Thr Lys Asp Glu Leu Lys Asp Ser As145 150 155 160 Lys Asp Leu Ile Pro Asp Glu Phe Glu Lys Asn Gly Tyr ThrPhe As 165 170 175 Gly Ile Gln Ile Val Pro Trp Asp Glu Ser Leu Gln GluGln Gly Ph 180 185 190 Lys Lys Tyr Ile Ser Asn Pro Tyr Gln Ser Arg ThrAla Gln Asp Pr 195 200 205 Tyr Thr Asp Phe Glu Lys Val Thr Gly Tyr MetPro Ala Glu Thr Gl 210 215 220 Leu Glu Thr Arg Asp Pro Leu Val Ala AlaTyr Pro Ala Val Gly Va 225 230 235 240 Thr Met Glu Gln Phe Ile Phe SerLys Asn Asp Asn Val Gln Glu Se 245 250 255 Asn Gly Gly Gly Thr Ser LysSer Met Thr Glu Ser Ser Glu Thr Th 260 265 270 Tyr Ser Val Glu Ile GlyGly Lys Phe Thr Leu Asn Pro Phe Ala Le 275 280 285 Ala Glu Ile Ser ProLys Tyr Ser His Ser Trp Lys Asn Gly Ala Se 290 295 300 Thr Thr Glu GlyGlu Ser Thr Ser Trp Ser Ser Gln Ile Gly Ile As 305 310 315 320 Thr AlaGlu Arg Ala Phe Phe Lys Cys Gln Tyr Ser Ile 325 330 1073 base pairsnucleic acid single linear DNA (genomic) 202E1 45 TGGGTTAATT GGGTACTATTTTCAGGATCA AAAGTTTCAA CAACTCGCTT TGATGGCACA 60 TAGACAAGCT TCAGATTTAGAAATACCTAA AAATGAAGTG AAGGATATAT TATCTAAAGA 120 TCAACAACAT ATTCAATCAGTGAGATGGAG GGGGTATATT AAGCCACCTC AAACAGGAGA 180 CTATATATTG TCAACCTCATCCGACCAACA GGTCGTGATT GAACTCGATG GAAAAAACAT 240 TGTCAATCAA ACTTCTATGACAGAACCAAT TCAACTCGAA AAAGATAAAC TCTATAAAAT 300 TAGAATTGAA TATGTCCCAGGAGATACAAA AGGACAAGAG AGCCTCCTTG 
ACTTTCAACT 360 TAACTGGTCA ATTTCAGGAGATACGGTGGA ACCAATTCCG GAGAATGCAT TTCTGTTACC 420 AGACTTTTCT CATCAACAAGATCAAGAGAA ACTCATCCCT GAAATCAGTC TATTTCAGGA 480 ACAAGGAGAT GAGAAAAAAGTATCTCGTAG TAAGAGGTCT TTAGCTACAA ACCCTCTCCT 540 TGATACAGAT GATGATGGTATTTATGATGA ATGGGAAACG GAAGGATACA CAATACAGGG 600 ACAACTAGCG GTGAAATGGGACGATTCTAT GAAGGAGCGA GGTTATACTA AGTATGTGTC 660 TAACCCTTAC AAGGCTCATACAGTAGGAGA TCCCTACACA GATTGGGAAA AAGCGGCTGG 720 CCGTATCGAT AACGCTGTCAAAGCAGAAGC TAGGAATCCT TTAGTCGCGG CCTATCCAAC 780 TGTTGGTGTA CATATGGAAAGACTAATTGT CTCCGAAAAA CAAAATATAT CAACAGGACT 840 TGGAAAAACC GTATCTGTGTCTATGTCCGC AAGCAATACC GCAGCGATTA CGGCAGGAAT 900 TAATGCAACA GCCGGTGCCTCTTTACTTGG GCCATCTGGA AACGTCACGG CTCATTTTTC 960 TTATACAGGA TCTAGTACATCCACTGTTGA AAATAGCTCA AGTAATAATT GGAGTCAAGA 1020 TCTTGGAATC GATACGGGACAATCTGCGTA TTTAAATGCC AATGTAAGAT ATA 1073 357 amino acids amino acidsingle linear peptide 202E1 46 Gly Leu Ile Gly Tyr Tyr Phe Gln Asp GlnLys Phe Gln Gln Leu Al 1 5 10 15 Leu Met Ala His Arg Gln Ala Ser Asp LeuGlu Ile Pro Lys Asn Gl 20 25 30 Val Lys Asp Ile Leu Ser Lys Asp Gln GlnHis Ile Gln Ser Val Ar 35 40 45 Trp Arg Gly Tyr Ile Lys Pro Pro Gln ThrGly Asp Tyr Ile Leu Se 50 55 60 Thr Ser Ser Asp Gln Gln Val Val Ile GluLeu Asp Gly Lys Asn Il 65 70 75 80 Val Asn Gln Thr Ser Met Thr Glu ProIle Gln Leu Glu Lys Asp Ly 85 90 95 Leu Tyr Lys Ile Arg Ile Glu Tyr ValPro Gly Asp Thr Lys Gly Gl 100 105 110 Glu Ser Leu Leu Asp Phe Gln LeuAsn Trp Ser Ile Ser Gly Asp Th 115 120 125 Val Glu Pro Ile Pro Glu AsnAla Phe Leu Leu Pro Asp Phe Ser Hi 130 135 140 Gln Gln Asp Gln Glu LysLeu Ile Pro Glu Ile Ser Leu Phe Gln Gl 145 150 155 160 Gln Gly Asp GluLys Lys Val Ser Arg Ser Lys Arg Ser Leu Ala Th 165 170 175 Asn Pro LeuLeu Asp Thr Asp Asp Asp Gly Ile Tyr Asp Glu Trp Gl 180 185 190 Thr GluGly Tyr Thr Ile Gln Gly Gln Leu Ala Val Lys Trp Asp As 195 200 205 SerMet Lys Glu Arg Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Ly 210 215 220Ala His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gl 225 230235 240 
Arg Ile Asp Asn Ala Val Lys Ala Glu Ala Arg Asn Pro Leu Val Al245 250 255 Ala Tyr Pro Thr Val Gly Val His Met Glu Arg Leu Ile Val SerGl 260 265 270 Lys Gln Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser ValSer Me 275 280 285 Ser Ala Ser Asn Thr Ala Ala Ile Thr Ala Gly Ile AsnAla Thr Al 290 295 300 Gly Ala Ser Leu Leu Gly Pro Ser Gly Asn Val ThrAla His Phe Se 305 310 315 320 Tyr Thr Gly Ser Ser Thr Ser Thr Val GluAsn Ser Ser Ser Asn As 325 330 335 Trp Ser Gln Asp Leu Gly Ile Asp ThrGly Gln Ser Ala Tyr Leu As 340 345 350 Ala Asn Val Arg Tyr 355 967 basepairs nucleic acid single linear DNA (genomic) KB33 47 TGGATTACTTGGGTACTATT TTGAAGAACC AAACTTTAAT GACCTTCTAT TAATCACACA 60 AAAAAACAACAGTAATTTAT CTCTAGAAAA AGAACATATT TCATCGTTAT CTAGTATTAG 120 AAATAAAGGCATTCAATCTG CTAGATGGTT AGGTTTTTTA AAACCAAAGC AAACGGATGA 180 ATATGTTTTTTTTAGTCCTT CCAACCATGA AATCATGATT CAAATCGATA ACAAAATTAT 240 TGTAATGGGTAGAAAAATTA TGTTAGAAGA AGGAAAGGTA TATCCAATTC GAATTGAATG 300 CCGCTTTGAAAAAACAAATA ATCTAGATAT AAACTGCGAA CTACTTTGGA CGCATTCTGA 360 TACAAAAGAAATCATTTCTC AAAACTGTTT GCTGGCACCT GATTATCATA ATACAGAATT 420 TTACCCAAAAACAAATTTAT TTGGGGATGT ATCTACTACG ACTAGTGATA CTGATAATGA 480 TGGAATACCAGATGACTGGG AAATTAATGG TTATACGTTT GATGGTACAA ATATAATTCA 540 ATGGAATCCTGCTTATGAAG GGTTATATAC TAAATATATT TCTAACCCTA AACAAGCAAG 600 TACAGTAGGTGATCCATATA CAGATTTAGA GAACGTMCAA AGCTAAAKGG ATCAAAGAAS 660 CARGAAAYCCTTKTAGCAGA AGCTWATCCG AAAAATTGGA BTTAGCATGG AAGAATTACT 720 CRTCTCTKTAWAARTGKTGA TKTWTTCAAA TGCTCAAGAA AATKACTACT TACTTCTAGT 780 AGRACAGAAGGCACTTCASG TAGYGCAGGC ATTGAGGGAG GAGCAGAAGG AAAAAAACCT 840 ACAGGATTGGTTTCAGCCTC CTTTTCGCAT TCATCTTCAA CAACAAACAC AACGGAACAA 900 ATGAATGGAACAATGATTCA TCTTGATACA GGAGAATCAG CGTATTTAAA TGCCAATGTA 960 AGATATA 967972 base pairs nucleic acid single linear DNA (genomic) KB38 48TGGATTACTT GGGTATTATT TTGAAGAACC AAACTTTAAT AACCTTCTAT TAATCACACA 60AAAAAACAAC AGTAATTTAT CTCTAGAAAA AGAACATATT TCATCGTTAT CTAGTATTAG 120AAATAAAGGC ATTCAATCTG CTAGATGGTT AGGTTTTTTA AAACCAGAGC AAACGGATGA 
180ATATGTTTTT TTTAGTCCTT CCAACCATGA AATTATGATT CAAATCGATA ACAAAATTAT 240TGTAATGGGT AGAAAAATTA TGTTAGAAAA AGGAAAGGTA TATCCAATTC GAATTGAATG 300CCGCTTTGAA AAAACAAATA ATATAGATAT AAACTGCGAA CTACTTTGGA CGCACTCTGA 360TACAAAAGAA ATCATTTCTC AAAACTTTTT GCTGGCACCT GATTATAACA ATACAGAATT 420TTATCCAAAA ACAAATTTAT TTGGAGATGT ATCTACTACG ACTWAGTGAT ACTGATAATG 480ATGGAATACC AGATGACTGG GAAATTAATG GTTATACCTT TGATGGTACA AATATAATTC 540AGTGGAATTC TGCTTATGAA GGGTTATATA CTAAATATGT TTCTAATCCT AAACAAGCAA 600GTACAGTAGG TGATCCATAT ACAGATTTAG AGAAAGTAAC AGCTCAAATG GATCGAGCAA 660CCTCTCTAGA AGCAAGGAAT CCTTTAGTAG CAGCTTATCC AAAAATTGGA GTTAGCATGG 720AAGAATTACT CATCTCTTTA AATGTTGATT TTTCAAATGC TCAAGAAAAT ACTACTTCTT 780CTAGTAGAAC AGAAGGCACT TCACGTAGCG CAGGCATTGA GGGAGGAGCA GAAGGAAAAA 840AACCTACAGG ATTGGTTTCA GCCTCCTTTT CGCATTCATC TTCAACAACA AACACAACGG 900AACAAATGAA TGGAACAATG ATTCATCTTG ATACAGGAGA ATCAGCGTAT TTAAATGCCA 960ATGTAAGATA TA 972 21 base pairs nucleic acid single linear DNA (genomic)49 CTTGAYTTTA AARATGATRT A 21 21 base pairs nucleic acid single linearDNA (genomic) 50 AATRGCSWAT AAATAMGCAC C 21 1341 base pairs nucleic acidsingle linear DNA (genomic) PS177C8 51 ATGTTTATGG TTTCTAAAAA ATTACAAGTAGTTACTAAAA CTGTATTGCT TAGTACAGTT 60 TTCTCTATAT CTTTATTAAA TAATGAAGTGATAAAAGCTG AACAATTAAA TATAAATTCT 120 CAAAGTAAAT ATACTAACTT GCAAAATCTAAAAATCACTG ACAAGGTAGA GGATTTTAAA 180 GAAGATAAGG AAAAAGCGAA AGAATGGGGGAAAGAAAAAG AAAAAGAGTG GAAACTAACT 240 GCTACTGAAA AAGGAAAAAT GAATAATTTTTTAGATAATA AAAATGATAT AAAGACAAAT 300 TATAAAGAAA TTACTTTTTC TATGGCAGGCTCATTTGAAG ATGAAATAAA AGATTTAAAA 360 GAAATTGATA AGATGTTTGA TAAAACCAATCTATCAAATT CTATTATCAC CTATAAAAAT 420 GTGGAACCGA CAACAATTGG ATTTAATAAATCTTTAACAG AAGGTAATAC GATTAATTCT 480 GATGCAATGG CACAGTTTAA AGAACAATTTTTAGATAGGG ATATTAAGTT TGATAGTTAT 540 CTAGATACGC ATTTAACTGC TCAACAAGTTTCCAGTAAAG AAAGAGTTAT TTTGAAGGTT 600 ACGGTTCCGA GTGGGAAAGG TTCTACTACTCCAACAAAAG CAGGTGTCAT TTTAAATAAT 660 AGTGAATACA AAATGCTCAT TGATAATGGGTATATGGTCC ATGTAGATAA GGTATCAAAA 720 GTGGTGAAAA 
AAGGGGTGGA GTGCTTACAAATTGAAGGGA CTTTAAAAAA GAGTCTTGAC 780 TTTAAAAATG ATATAAATGC TGAAGCGCATAGCTGGGGTA TGAAGAATTA TGAAGAGTGG 840 GCTAAAGATT TAACCGATTC GCAAAGGGAAGCTTTAGATG GGTATGCTAG GCAAGATTAT 900 AAAGAAATCA ATAATTATTT AAGAAATCAAGGCGGAAGTG GAAATGAAAA ACTAGATGCT 960 CAAATAAAAA ATATTTCTGA TGCTTTAGGGAAGAAACCAA TACCGGAAAA TATTACTGTG 1020 TATAGATGGT GTGGCATGCC GGAATTTGGTTATCAAATTA GTGATCCGTT ACCTTCTTTA 1080 AAAGATTTTG AAGAACAATT TTTAAATACAATCAAAGAAG ACAAAGGATA TATGAGTACA 1140 AGCTTATCGA GTGAACGTCT TGCAGCTTTTGGATCTAGAA AAATTATATT ACGATTACAA 1200 GTTCCGAAAG GAAGTACGGG TGCGTATTTAAGTGCCATTG GTGGATTTGC AAGTGAAAAA 1260 GAGATCCTAC TTGATAAAGA TAGTAAATATCATATTGATA AAGTAACAGA GGTAATTATT 1320 AAGGTGTTAA GCGATATGTA G 1341 446amino acids amino acid single linear peptide PS177C8 52 Met Phe Met ValSer Lys Lys Leu Gln Val Val Thr Lys Thr Val Le 1 5 10 15 Leu Ser Thr ValPhe Ser Ile Ser Leu Leu Asn Asn Glu Val Ile Ly 20 25 30 Ala Glu Gln LeuAsn Ile Asn Ser Gln Ser Lys Tyr Thr Asn Leu Gl 35 40 45 Asn Leu Lys IleThr Asp Lys Val Glu Asp Phe Lys Glu Asp Lys Gl 50 55 60 Lys Ala Lys GluTrp Gly Lys Glu Lys Glu Lys Glu Trp Lys Leu Th 65 70 75 80 Ala Thr GluLys Gly Lys Met Asn Asn Phe Leu Asp Asn Lys Asn As 85 90 95 Ile Lys ThrAsn Tyr Lys Glu Ile Thr Phe Ser Met Ala Gly Ser Ph 100 105 110 Glu AspGlu Ile Lys Asp Leu Lys Glu Ile Asp Lys Met Phe Asp Ly 115 120 125 ThrAsn Leu Ser Asn Ser Ile Ile Thr Tyr Lys Asn Val Glu Pro Th 130 135 140Thr Ile Gly Phe Asn Lys Ser Leu Thr Glu Gly Asn Thr Ile Asn Se 145 150155 160 Asp Ala Met Ala Gln Phe Lys Glu Gln Phe Leu Asp Arg Asp Ile Ly165 170 175 Phe Asp Ser Tyr Leu Asp Thr His Leu Thr Ala Gln Gln Val SerSe 180 185 190 Lys Glu Arg Val Ile Leu Lys Val Thr Val Pro Ser Gly LysGly Se 195 200 205 Thr Thr Pro Thr Lys Ala Gly Val Ile Leu Asn Asn SerGlu Tyr Ly 210 215 220 Met Leu Ile Asp Asn Gly Tyr Met Val His Val AspLys Val Ser Ly 225 230 235 240 Val Val Lys Lys Gly Val Glu Cys Leu GlnIle Glu Gly Thr Leu Ly 245 250 255 Lys Ser Leu Asp Phe Lys Asn Asp IleAsn Ala Glu 
Ala His Ser Tr 260 265 270 Gly Met Lys Asn Tyr Glu Glu TrpAla Lys Asp Leu Thr Asp Ser Gl 275 280 285 Arg Glu Ala Leu Asp Gly TyrAla Arg Gln Asp Tyr Lys Glu Ile As 290 295 300 Asn Tyr Leu Arg Asn GlnGly Gly Ser Gly Asn Glu Lys Leu Asp Al 305 310 315 320 Gln Ile Lys AsnIle Ser Asp Ala Leu Gly Lys Lys Pro Ile Pro Gl 325 330 335 Asn Ile ThrVal Tyr Arg Trp Cys Gly Met Pro Glu Phe Gly Tyr Gl 340 345 350 Ile SerAsp Pro Leu Pro Ser Leu Lys Asp Phe Glu Glu Gln Phe Le 355 360 365 AsnThr Ile Lys Glu Asp Lys Gly Tyr Met Ser Thr Ser Leu Ser Se 370 375 380Glu Arg Leu Ala Ala Phe Gly Ser Arg Lys Ile Ile Leu Arg Leu Gl 385 390395 400 Val Pro Lys Gly Ser Thr Gly Ala Tyr Leu Ser Ala Ile Gly Gly Ph405 410 415 Ala Ser Glu Lys Glu Ile Leu Leu Asp Lys Asp Ser Lys Tyr HisIl 420 425 430 Asp Lys Val Thr Glu Val Ile Ile Lys Val Leu Ser Asp Met435 440 445 17 base pairs nucleic acid single linear DNA (genomic) 53GGATTCGTTA TCAGAAA 17 17 base pairs nucleic acid single linear DNA(genomic) 54 CTGTYGCTAA CAATGTC 17 8 amino acids amino acid singlelinear peptide 55 Ala Asp Glu Pro Phe Asn Ala Asp 1 5 21 base pairsnucleic acid single linear DNA (genomic) 56 GCTGATGAAC CATTTAATGC C 21 8amino acids amino acid single linear peptide 57 Leu Phe Lys Val Asp ThrLys Gln 1 5 22 base pairs nucleic acid single linear DNA (genomic) 58CTCTTTAAAG TAGATACTAA GC 22 9 amino acids amino acid single linearpeptide 59 Pro Asp Glu Asn Leu Ser Asn Ile Glu 1 5 24 base pairs nucleicacid single linear DNA (genomic) 60 GATGAGAACT TATCAAATAG TATC 24 12amino acids amino acid single linear peptide 61 Ala Asn Ser Leu Leu AspLys Gln Gln Gln Thr Tyr 1 5 10 33 base pairs nucleic acid single linearDNA (genomic) 62 CGAATTCTTT ATTAGATAAG CAACAACAAA CCT 33 8 amino acidsamino acid single linear peptide 63 Val Ile Ser Gln Lys Gly Gln Lys 1 524 base pairs nucleic acid single linear DNA (genomic) 64 GTTATTTCGCAAAAAGGCCA AAAG 24 11 amino acids amino acid single linear peptide 65Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro Asp 1 5 10 31 base pairs 
nucleicacid single linear DNA (genomic) 66 GAATATCAAT CTGATAAAGC GTTAAACCCA G31 9 amino acids amino acid single linear peptide 67 Ser Ser Leu Phe SerAsn Lys Ser Lys 1 5 23 base pairs nucleic acid single linear DNA(genomic) 68 GCAGCYTGTT TAGCAATAAA AGT 23 8 amino acids amino acidsingle linear peptide 69 Ile Lys Gly Arg Val Ala Val Lys 1 5 20 basepairs nucleic acid single linear DNA (genomic) 70 CAAAGGAAGA GTAGCTGTTA20 9 amino acids amino acid single linear peptide 71 Val Asn Val Ser LeuGlu Asn Val Thr 1 5 25 base pairs nucleic acid single linear DNA(genomic) 72 CAATGTTAGC TTGGAAAATG TCACC 25 8 amino acids amino acidsingle linear peptide 73 Thr Ala Phe Ile Gln Val Gly Glu 1 5 20 basepairs nucleic acid single linear DNA (genomic) 74 AGCATTTATT CAAGTAGGAG20 7 amino acids amino acid single linear peptide 75 Tyr Leu Leu Ser ThrSer Ser 1 5 19 base pairs nucleic acid single linear DNA (genomic) 76TCTACTTTCC ACGTCCTCT 19 7 amino acids amino acid single linear peptide77 Gln Ile Gln Pro Gln Gln Arg 1 5 19 base pairs nucleic acid singlelinear DNA (genomic) 78 CAGATACAAC CGCAACAGC 19 8 amino acids amino acidsingle linear peptide 79 Pro Gln Gln Arg Ser Thr Gln Ser 1 5 23 basepairs nucleic acid single linear DNA (genomic) 80 CCGCAACAGC GTTCAACTCAATC 23 7 amino acids amino acid single linear peptide 81 Asp Gly Ala IleVal Ala Trp 1 5 21 base pairs nucleic acid single linear DNA (genomic)82 GACGGTGCGA TTGTTGCCTG G 21 7 amino acids amino acid single linearpeptide 83 Glu Gly Asp Ser Gly Thr Val 1 5 19 base pairs nucleic acidsingle linear DNA (genomic) 84 GAAGGAGACT CAGGTACTG 19 6 amino acidsamino acid single linear peptide 85 Thr Val Thr Asn Thr Ser 1 5 19 basepairs nucleic acid single linear DNA (genomic) 86 CCGTAACCAA TACAAGCAC19 9 amino acids amino acid single linear peptide 87 Ser Ser Gln Leu AlaTyr Asn Pro Ser 1 5 25 base pairs nucleic acid single linear DNA(genomic) 88 CTTCACAATT AGCGTATAAT CCTTC 25 7 amino acids amino acidsingle linear peptide 89 Glu Gln His Lys Glu Val Ala 1 5 
19 base pairsnucleic acid single linear DNA (genomic) 90 GAGCAGCATA AGGAAGTAG 19 8amino acids amino acid single linear peptide 91 Phe Asn Gly Ile Gln IleVal Pro 1 5 25 base pairs nucleic acid single linear DNA (genomic) 92CATTCAATGG GATTCAAATT GTTCC 25 8 amino acids amino acid single linearpeptide 93 Val Gln Glu Ser Asn Gly Gly Gly 1 5 23 base pairs nucleicacid single linear DNA (genomic) 94 GTGCAGGAAT CTAATGGTGG AGG 23 9 aminoacids amino acid single linear peptide 95 Glu Ile Gly Gly Lys Phe ThrLeu Asn 1 5 22 base pairs nucleic acid single linear DNA (genomic) 96GATAGGAGGG AAATTTACAT TG 22 19 base pairs nucleic acid single linear DNA(genomic) 97 CGAATTGAAT GCCGCTTTG 19 22 base pairs nucleic acid singlelinear DNA (genomic) 98 CTCAAAACTK TTTGCTGGCA CC 22 20 base pairsnucleic acid single linear DNA (genomic) 99 GGATCRAGCA ACCTCTCTAG 20 18base pairs nucleic acid single linear DNA (genomic) 100 ACTACTTACTTCTAGTAG 18 8 amino acids amino acid single linear peptide 101 Ser AspGln Gln Val Val Ile Glu 1 5 21 base pairs nucleic acid single linear DNA(genomic) 102 CCGAYCRACA KGTCRTRATT G 21 7 amino acids amino acid singlelinear peptide 103 Asn Gln Thr Ser Met Thr Glu 1 5 21 base pairs nucleicacid single linear DNA (genomic) 104 TCARDCTTCT ATGACAGMAC C 21 8 aminoacids amino acid single linear peptide 105 Gln Asp Gln Glu Lys Ile IlePro 1 5 24 base pairs nucleic acid single linear DNA (genomic) 106CAAGATCAAG ARAARMTYAT YCCT 24 7 amino acids amino acid single linearpeptide 107 Ser His Lys Gln Asp Gln Glu 1 5 18 base pairs nucleic acidsingle linear DNA (genomic) 108 CTCRTMAACA AGATCAAG 18 7 amino acidsamino acid single linear peptide 109 Ser Gly Ser Val Thr Ala His 1 5 18base pairs nucleic acid single linear DNA (genomic) 110 CTGGAARYGTSACGGCTC 18 22 base pairs nucleic acid single linear DNA (genomic) 111GCTTAGTATC TACTTTAAAG AG 22 24 base pairs nucleic acid single linear DNA(genomic) 112 GATACTATTT GATAAGTTCT CATC 24 24 base pairs nucleic acidsingle linear DNA (genomic) 113 
CTTTTGGCCT TTTTGCGAAA TAAC 24 31 basepairs nucleic acid single linear DNA (genomic) 114 CTGGGTTTAA CGCTTTATCAGATTGATATT C 31 23 base pairs nucleic acid single linear DNA (genomic)115 ACTTTTATTG CTAAACARGC TGC 23 20 base pairs nucleic acid singlelinear DNA (genomic) 116 TAACAGCTAC TCTTCCTTTG 20 25 base pairs nucleicacid single linear DNA (genomic) 117 GGTGACATTT TCCAAGCTAA CATTG 25 19base pairs nucleic acid single linear DNA (genomic) 118 AGAGGACGTGGAAAGTAGA 19 19 base pairs nucleic acid single linear DNA (genomic) 119GCTGTTGCGG TTGTATCTG 19 23 base pairs nucleic acid single linear DNA(genomic) 120 GATTGAGTTG AACGCTGTTG CGG 23 21 base pairs nucleic acidsingle linear DNA (genomic) 121 CCAGGCAACA ATCGCACCGT C 21 19 base pairsnucleic acid single linear DNA (genomic) 122 CAGTACCTGA GTCTCCTTC 19 19base pairs nucleic acid single linear DNA (genomic) 123 GTGCTTGTATTGGTTACGG 19 25 base pairs nucleic acid single linear DNA (genomic) 124GAAGGATTAT ACGCTAATTG TGAAG 25 25 base pairs nucleic acid single linearDNA (genomic) 125 GGAACAATTT GAATCCCATT GAATG 25 23 base pairs nucleicacid single linear DNA (genomic) 126 CCTCCACCAT TAGATTCCTG CAC 23 22base pairs nucleic acid single linear DNA (genomic) 127 CAATGTAAATTTCCCTCCTA TC 22 22 base pairs nucleic acid single linear DNA (genomic)128 GGTGCCAGCA AAMAGTTTTG AG 22 20 base pairs nucleic acid single linearDNA (genomic) 129 CTAGAGAGGT TGCTYGATCC 20 18 base pairs nucleic acidsingle linear DNA (genomic) 130 CTACTAGAAG TAAGTAGT 18 21 base pairsnucleic acid single linear DNA (genomic) 131 GGTKCTGTCA TAGAAGHYTG A 2124 base pairs nucleic acid single linear DNA (genomic) 132 AGGRATRAKYTTYTCTTGAT CTTG 24 18 base pairs nucleic acid single linear DNA(genomic) 133 CTTGATCTTG TTKAYGAG 18 18 base pairs nucleic acid singlelinear DNA (genomic) 134 GAGCCGTSAC RYTTCCAG 18 21 base pairs nucleicacid single linear DNA (genomic) 135 CCAGTCCAAT GAACCTCTTA C 21 21 basepairs nucleic acid single linear DNA (genomic) 136 AGGGAACAAA CCTTCCCAACC 21 20 base pairs 
nucleic acid single linear DNA (genomic) 137CARMTAKTAA MTAGGGATAG 20 22 base pairs nucleic acid single linear DNA(genomic) 138 AGYTTCTATC GAAGCTGGGR ST 22 1035 base pairs nucleic acidsingle linear DNA (genomic) 139 GGGTTAATTG GGTATTATTT TAAAGGGAAAGATTTTAATA ATCTGACTAT GTTTGCACCA 60 ACCATAAATA ATACGCTTAT TTATGATCGGCAAACAGCAG ATACACTATT AAATAAGCAG 120 CAACAAGAGT TCAATTCTAT TCGATGGATTGGTTTAATAC AAAGTAAAGA AACAGGTGAC 180 TTTACATTCC AATTATCAGA TGATAAAAATGCCATCATTG AAATAGATGG AAAAGTTGTT 240 TCTCGTAGAG GAGAAGATAA ACAAACTATCCATTTAGAAA AAGGAAAGAT GGTTCCAATC 300 AAAATTGAGT ACCAGTCCAA TGAACCTCTTACTGTAGATA GTAAAGTATT TAACGATCTT 360 AAACTATTTA AAATAGATGG TCATAATCAATCGCATCAAA TACAGCAAGA TGATTTGAAA 420 ATCCTGAATT TAATAAAAAG GAAACGAAAGAGCTTTTATC AAAAACAGCA AAAAGAACCT 480 TTTCTCTTCA AAACGGGGTT GAGAAGCGATGAGGATGATG ATCTAGGATA CAGATGGTGA 540 TAGCATTCCT GGATAATTGG GAAATGAATGGATATACCAT TCAAACGAAA AATGGCAGTC 600 AAATGGGATG ATTCATTTGC AGAAAAAGGATATACAAAAT TTGTTTCGAA TCCATATGAA 660 GCCCATACAG CAGGAGATCC TTATACCGATTATGAAAAAG CAGCAAAAGA TATTCCTTTA 720 TCGAACGCAA AAGAAGCCTT TAATCCTCTTGTAGCTGCTT TTCCATCTGT CAATGTAGGA 780 TTAGAAAAAG TAGTAATTTC TAAAAATGAGGATATGAGTC AGGGTGTATC ATCCAGCACT 840 TCGAATAGTG CCTCTAATAC AAATTCAATTGGTGTTACCG TAGATGCTGG TTGGGAAGGT 900 TTGTTCCCTA AATTTGGTAT TTCAACTAATTATCAAAACA CATGGACCAC TGCACAAGAA 960 TGGGGCTCTT CTAAAGAAGA TTCTACCCATATAAATGGAG CACAATCAGC CTTTTTAAAT 1020 GCAAATGTAC GATAT 1035 345 aminoacids amino acid single linear protein 140 Gly Leu Ile Gly Tyr Tyr PheLys Gly Lys Asp Phe Asn Asn Leu Th 1 5 10 15 Met Phe Ala Pro Thr Ile AsnAsn Thr Leu Ile Tyr Asp Arg Gln Th 20 25 30 Ala Asp Thr Leu Leu Asn LysGln Gln Gln Glu Phe Asn Ser Ile Ar 35 40 45 Trp Ile Gly Leu Ile Gln SerLys Glu Thr Gly Asp Phe Thr Phe Gl 50 55 60 Leu Ser Asp Asp Lys Asn AlaIle Ile Glu Ile Asp Gly Lys Val Va 65 70 75 80 Ser Arg Arg Gly Glu AspLys Gln Thr Ile His Leu Glu Lys Gly Ly 85 90 95 Met Val Pro Ile Lys IleGlu Tyr Gln Ser Asn Glu Pro Leu Thr Va 100 105 110 Asp Ser Lys Val PheAsn Asp Leu 
Lys Leu Phe Lys Ile Asp Gly Hi 115 120 125 Asn Gln Ser HisGln Ile Gln Gln Asp Asp Leu Lys Ile Leu Asn Le 130 135 140 Ile Lys ArgLys Arg Lys Ser Phe Tyr Gln Lys Gln Gln Lys Glu Pr 145 150 155 160 PheLeu Phe Lys Thr Gly Leu Arg Ser Asp Glu Asp Asp Asp Leu Gl 165 170 175Tyr Arg Trp Xaa Xaa His Ser Trp Ile Ile Gly Lys Xaa Met Asp Il 180 185190 Pro Phe Lys Arg Lys Met Ala Val Lys Trp Asp Asp Ser Phe Ala Gl 195200 205 Lys Gly Tyr Thr Lys Phe Val Ser Asn Pro Tyr Glu Ala His Thr Al210 215 220 Gly Asp Pro Tyr Thr Asp Tyr Glu Lys Ala Ala Lys Asp Ile ProLe 225 230 235 240 Ser Asn Ala Lys Glu Ala Phe Asn Pro Leu Val Ala AlaPhe Pro Se 245 250 255 Val Asn Val Gly Leu Glu Lys Val Val Ile Ser LysAsn Glu Asp Me 260 265 270 Ser Gln Gly Val Ser Ser Ser Thr Ser Asn SerAla Ser Asn Thr As 275 280 285 Ser Ile Gly Val Thr Val Asp Ala Gly TrpGlu Gly Leu Phe Pro Ly 290 295 300 Phe Gly Ile Ser Thr Asn Tyr Gln AsnThr Trp Thr Thr Ala Gln Gl 305 310 315 320 Trp Gly Ser Ser Lys Glu AspSer Thr His Ile Asn Gly Ala Gln Se 325 330 335 Ala Phe Leu Asn Ala AsnVal Arg Tyr 340 345 1037 base pairs nucleic acid single linear DNA(genomic) 141 GGGTTAATTG GGTATTATTT TAAAGGGAAA GATTTTAATA ATCTGACTATGTTTGCACCA 60 ACCATAAATA ATACGCTTAT TTATGATCGG CAAACAGCAG ATACACTATTAAATAAGCAG 120 CAACAAGAGT TCAATTCTAT TCGATGGATT GGTTTAATAC AAAGTAAAGAAACAGGTGAC 180 TTTACATTCC AATTATCAGA TGATAAAAAT GCCATCATTG AAATAGATGGAAAAGTTGTT 240 TCTCGTAGAG GAGAAGATAA ACAAACTATC CATTTAGAAA AAGGAAAGATGGTTCCAATC 300 AAAATTGAGT ACCAGTCCAA TGAACCTCTT ACTGTAGATA GTAAAGTATTTAACGATCTT 360 AAACTATTTA AAATAGATGG TCATAATCAA TCGCATCAAA TACAGCAAGATGATTTGAAA 420 AATCCTGAAT TTAATAAAAA AGAAACGAAA GAGCTTTTAT CAAAAACAGCAAAAAGRAAC 480 CTTTTCTCTT CAAACGRRGT KGAGAAGCGA TGAGGATGAT RATCYTAGATACAGGTGGKG 540 ATAGCATTCC YKGATAATTG GGGAAATGAA WGGRTATACC ATTCAACSGAAAAATGGSAG 600 TCAAATGGGA TGATTCATTT GCGGAAAAAG GATATACAAA ATTTGTTTCGAATCCATATG 660 AAGCCCATAC AGCAGGAGAT CCTTATACCG ATTATGAAAA AGCAGCAAAAGATATTCCTT 720 TATCGAACGC AAAAGAAGCC TTTAATCCTC TTGTAGCTGC 
TTTTCCATCTGTCAATGTAG 780 GATTAGAAAA AGTAGTAATT TCTAAAAATG AGGATATGAG TCAGGGTGTATCATCCAGCA 840 CTTCGAATAG TGCCTCTAAT ACAAATTCAA TTGGTGTTAC CGTAGATGCTGGTTGGGAAG 900 GTTTGTTCCC TAAATTTGGT ATTTCAACTA ATTATCAAAA CACATGGACCACTGCACAAG 960 AATGGGGCTC TTCTAAAGAA GATTCTACCC ATATAAATGG AGCACAATCAGCCTTTTTAA 1020 ATGCAAATGT ACGATAT 1037 1048 base pairs nucleic acidsingle linear DNA (genomic) 142 TGGGTTAATT GGGTATTATT TTAAAGGGCAAGAGTTTAAT CATCTTACTT TGTTCGCACC 60 AACACGTGAT AATACCCTTA TTTATGATCAACAAACAGCG AATTCCTTAT TAGATACCAA 120 GCAACAAGAA TATCAATCTA TTCGCTGGATTGGTTTAATT CAAAGTAAAG AAACGGGTGA 180 TTTCACATTT AACTTATCAG ATGATCAACATGCAATTATA GAAATCGATG GCAAAATCAT 240 TTCGCATAAA GGACAGAATA AACAAGTTGTTCACTTAGAA AAAGGAAAGT TAGTCCCGAT 300 AAAAATTGAG TATCAATCAG ATCAACTATTAAATAGGGAT AGTAACATCT TTAAAGAGTT 360 TAAATTATTC AAAGTAGATA GTCAGCAACACGCTCACCAA GTTCAACTAG ACGAATTAAG 420 AAACCCTGCG TTTAATAAAA AGGAAACACAACAATCTTAA GAAAAAGCAT CCAAAAACAA 480 TCTTTTTACA CCAGGGACAT TAAAAGGAAGATACTGATGA TGATGATAAG GATAACAGGA 540 TGGGAGATTC TATTCCTGGA CCTTTTGGGGGAAGAAAATG GGTATACCAA TCCCAAAATA 600 AAATAGCTGG TCCAAGTGGG ATGTTCATTCGCCGCGAAAG GGTATACAAA TTTGTTTCTT 660 AATCCACTTG ATAGTCATAC AGTTGGAGATCCCTATACGG ATTATGAAAA AGCAGCAAGA 720 GATTTAGACT TGGCCCAATG CAAAAGAAACATTTAACCCA TTAGTAGCTG CTTTTCCAAG 780 TGTGAATGTG AATTTGGAAA AAGTCATTTTATCTAAAGAT GAAAATCTAT CCAATAGTGT 840 AGAGTCACAT TCCTCCACCA ACTGGTCTTATACGAATACA GAAGGAGCTT CTATCGAAGC 900 TGGGGCTAAA CCAGAGGGTC CTACTTTTGGAGTGAGTGCT ACTTATCAAC ACTCTGAAAC 960 AGTTGCAAAA GAATGGGGAA CATCTACAGGAAATACCTCG CAATTTAATA CAGCTTCAGC 1020 AGGATATTTA AATGCAAATG TACGATAT1048 1175 base pairs nucleic acid single linear DNA (genomic) 143ACCTCTAGAT GCANGCTCGA GCGGCCGCCA GTGTGATGGA TATCTGCAGA ATTCGGATTA 60CTTGGGTATT ATTTTAAAGG GAAAGAGTTT AATCATCTTA CTTTGTTCGC ACCAACACGT 120GATAATACCC TTATTTATGA TCAACAAACA GCGAATTCCT TATTAGATAC CAAACAACAA 180GAATATCAAT CTATTCGCTG GATTGGTTTG ATTCAAAGTA AAGAAACAGG TGATTTCACG 240TTTAACTTAT CTGATGATCA AAATGCAATT ATAGAAATAG ATGGCAAAAT CATTTCGCAT 
300AAAGGACAGA ATAAACAAGT TGTTCACTTA GAAAAAGGAA AGTTAGTCCC GATAAAAATT 360GAGTATCAAT CAGATCAGAT ATTAACTAGG GATAGTAACA TCTTTAAAGA GTTCAATTAT 420TCAAAGTAGA TAGTCAAGCA ACACTCTCAC CAAAGTTCAA CTTAGGNCNG AATTAAGNAA 480CCCTNGGATT TTAANTTNAA AAAAAGGAAC CCNCANCATT CTTTAGGAAA AAGCAGCAAN 540AACCAAATCC TTTTTTACCA CAGGATATTG AAAAGGAGAT ACGGGNTNGA TGATGGATTG 600ATACCGGGAT ACCAGTTGGG GNTTCTANTC CCTGACCTTT GGGGAAAGAA AATNGGTATA 660CCNATCCCAA AANTTAAGCC AGCTGTCCAG GTGGGATGAT TCAATTCGCC CGCGAAAGGG 720TATACCAAAA TTTGTTTCTT AATCCACTTG AGAGTCATAC AGTTGGAGAT CCCTATACGG 780ATTATGAAAA AGCAGCAAGA GATTTAGACT TGGCCAATGC AAAAGAAACA TTTAACCCAT 840TAGTAGCTGC TTTTCCAAGT GTGAATGTGA ATTTGGAAAA AGTAATATTA TCCCCAGATG 900AGAATTTATC TAACAGTGTA GAATCTCATT CGTCTACAAA TTGGTCTTAT ACGAATACTG 960AAGGAGCTTC TATCGAAGCT GGGGGTGGTC CATTAGGTAT TTCATTTGGA GTGAGTGCTA 1020ATTATCAACA CTCTGAAACA GTTGCAAAAG AATGGGGAAC ATCTACAGGA AATACCTCGC 1080AATTTAATAC AGCTTCAGCA GGATATTTAA ATGCCAATGG TCGATNTAAG CCGAATNCCA 1140NCACACTGNC GGCCGTTAGT AGTGGCACCG AGCCC 1175 1030 base pairs nucleic acidsingle linear DNA (genomic) 144 GGRTTAMTTG GGTATTATTT TAAAGGGAAAGATTTTAATG ATCTTACTGT ATTTGCACCA 60 ACGCGTGGGA ATACTCTTGT ATATGATCAACAAACAGCAA ATACATTACT AAATCAAAAA 120 CAACAAGACT TTCAGTCTAT TCGTTGGGTTGGTTTAATTC AAAGTAAAGA AGCAGGCGAT 180 TTTACATTTA ACTTATCAGA TGATGAACATACGATGATAG AAATCGATGG GAAAGTTATT 240 TCTAATAAAG GGAAAGAAAA ACAAGTTGTCCATTTAGAAA AAGGACAGTT CGTTTCTATC 300 AAAATAGAAT ATCAAGCTGA TGAACCATTTAATGCGGATA GTCAAACCTT TAAAAATTTG 360 AAACTCYTTA AAGTAGATAC TAAGCAACAGTCCCAGCAAA TTCAACTAGA TGAATTAAGA 420 AACCCTGRAA TTTAATAAAA AAGAAACACAAGAATTTCTA ACAAAAGCAA CAAAAACAAA 480 CCTTATTACT CAAAAAGTGA AGAGTACTAGGGATGAAGAC ACGGATACAG ATGGAGATTC 540 TATTCCAGAC ATTTGGGAAG AAAATGGGTATACCATCCAA AATAAGATTG CCGTCAAATG 600 GGATGATTCA TTAGCAAGTA AAGGATATACGAAATTTGTT TCAAACCCAC TAGATACTCA 660 CACGGTTGGA GATCCTTATA CAGATTATGAAAAAGCAGCA AGGGATTTAG ATTTGTCAAA 720 TGCAAAAGAA ACATTTAACC CATTAGTTGCGGCTTTTCCA AGTGTGAATG TGAGTATGGA 780 AAAAGTGATA TTGTCTCCAG 
ATGAGAACTTATCAAATAGT ATCGAGTCTC ATTCATCTAC 840 GAATTGGTCG TATACGAATA CAGAAGGGGCTTCTATTGAA GCTGGTGGGG GAGCATTAGG 900 CCTATCTTTT GGTGTAAGTG CAAACTATCAACATTCTGAA ACAGTTGGGT ATGAATGGGG 960 AACATCTACG GGAAATACTT CGCAATTTAATACAGCTTCA GCGGGGTATT TAAATGCCAA 1020 TRTAMGATAT 1030
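Several oligonucleotides in the listing above appear to be reverse complements of one another (for example, SEQ ID NO:58 and SEQ ID NO:111, or SEQ ID NO:98 and SEQ ID NO:128), and the claims refer to the "full complement" of a listed nucleotide sequence. The following is only an illustrative sketch of how such a complement could be computed for sequences that contain IUPAC ambiguity codes (K, M, R, Y and so on); it is not part of the disclosed methods. The helper names `clean` and `reverse_complement` are hypothetical and were chosen for this example.

```python
# Complement table for IUPAC nucleotide codes, including the ambiguity
# codes that occur in the listing (e.g. K, M, R, Y, S, W, N).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def clean(seq: str) -> str:
    """Drop spaces and position numbers, keep letters only, upper-case."""
    return "".join(ch for ch in seq.upper() if ch.isalpha())


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence.

    Hypothetical helper: input may contain the blanks used to group
    bases in tens, as in the sequence listing.
    """
    bases = clean(seq)
    return "".join(IUPAC_COMPLEMENT[b] for b in reversed(bases))


# Sequences copied from the listing (SEQ ID NO:58 and SEQ ID NO:98).
seq_58 = "CTCTTTAAAG TAGATACTAA GC"
seq_98 = "CTCAAAACTK TTTGCTGGCA CC"

print(reverse_complement(seq_58))
print(reverse_complement(seq_98))
```

Comparing the printed strings with SEQ ID NO:111 and SEQ ID NO:128 is one way to confirm that each pair describes the two strands of the same site.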

1. An isolated protein that is pesticidal wherein a. said protein has toxin activity against a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:12, and SEQ ID NO:15; or b. said protein has toxin activity against a corn rootworm pest, and wherein said protein comprises an amino acid sequence selected from the group consisting of SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, and SEQ ID NO:46; or c. said protein has toxin activity against a corn rootworm pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:30, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
2. An isolated polynucleotide that encodes a protein wherein a. said protein has toxin activity against a lepidopteran pest, wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:12, and SEQ ID NO:15; or b. said protein has toxin activity against a corn rootworm pest, wherein said polynucleotide hybridizes under conditions of 0.1×SSPE at 65° C. with the full complement of a nucleotide sequence selected from the group consisting of SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:41, SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
3. A recombinant host comprising an isolated polynucleotide that encodes a protein, wherein said host is selected from the group consisting of a plant cell, a microbial cell, and a plant, and wherein a. said protein has toxin activity against a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:12, and SEQ ID NO:15, or b. said protein has toxin activity against a corn rootworm pest, wherein said polynucleotide hybridizes under conditions of 0.1×SSPE at 65° C. with the full complement of a nucleotide sequence selected from the group consisting of SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:41, SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
4. A method of controlling a pest wherein said method comprises administering an isolated protein to said pest wherein a. said pest is a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:12, and SEQ ID NO:15; or b. said pest is a corn rootworm pest, and wherein said protein comprises an amino acid sequence selected from the group consisting of SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, and SEQ ID NO:46; or c. said pest is a corn rootworm pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:30, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
5. A purified culture of a Bacillus thuringiensis isolate selected from the group consisting of Bacillus thuringiensis isolates PS10E1, PS31F2, PS33D2, PS66D3, PS68F, PS69AA2, PS146D, PS168G1, PS175I4, PS177C8a, PS177I8, PS185AA2, PS196J4, PS196F3, PS197T1, PS197U2, PS202E1, PS217U2, KB33, KB38, KB53A49-4, KB68B46-2, KB68B51-2, and KB68B55-2, and mutants thereof.
6. The protein of claim 1 wherein said protein comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:10.
7. The protein of claim 1 wherein said protein has toxin activity against a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:12, and said protein further comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:15.
8. The polynucleotide of claim 2 wherein said polynucleotide encodes a protein that has toxin activity against a lepidopteran pest, wherein said protein comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:10.
9. The polynucleotide of claim 2 wherein said polynucleotide encodes a protein that has toxin activity against a lepidopteran pest, wherein said protein comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:12, and said protein further comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:15.
10. The host according to claim 3 wherein said protein has toxin activity against a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:10.
11. The host according to claim 3 wherein said protein has toxin activity against a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:12, and said protein further comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:15.
12. The host according to claim 3 where said host is a plant cell.
13. The host according to claim 3 where said host is a bacterial cell.
14. The host according to claim 3 where said host is a plant.
15. The method of claim 4 wherein said pest is a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence of SEQ ID NO:10.
16. The method of claim 4 wherein said pest is a lepidopteran pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:12, and said protein further comprises an amino acid sequence coded for by the nucleotide sequence of SEQ ID NO:15.
17. The protein of claim 1 wherein said protein has toxin activity against a corn rootworm pest, and wherein said protein comprises an amino acid sequence selected from the group consisting of SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, and SEQ ID NO:46.
18. The protein of claim 1 wherein said protein has toxin activity against a corn rootworm pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:30, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
19. The polynucleotide of claim 2 wherein said protein has toxin activity against a corn rootworm pest, wherein said polynucleotide hybridizes under conditions of 0.1×SSPE at 65° C. with the full complement of a nucleotide sequence selected from the group consisting of SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:41, SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
20. The recombinant host of claim 3 wherein said protein has toxin activity against a corn rootworm pest, wherein said polynucleotide hybridizes under conditions of 0.1×SSPE at 65° C. with the full complement of a nucleotide sequence selected from the group consisting of SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:41, SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.
21. The method of claim 4 wherein said pest is a corn rootworm pest, and wherein said protein comprises an amino acid sequence selected from the group consisting of SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, and SEQ ID NO:46.
22. The method of claim 4 wherein said pest is a corn rootworm pest, and wherein said protein comprises an amino acid sequence coded for by a nucleotide sequence selected from the group consisting of SEQ ID NO:30, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:141, SEQ ID NO:142, and SEQ ID NO:144.